EDITORIAL


The 1918 flu, 100 years later 


Combating a disease of unknown cause is a daunting
task. One hundred years ago, a pandemic of
poorly understood etiology and transmissibility
spread worldwide, causing an estimated 50 million
deaths. The disease was initially attributed to Haemophilus
influenzae; not until the 1930s was an H1
subtype of influenza virus identified as the causative strain.
Subsequent influenza pandemics in 1957, 1968, and 
2009 did not approach levels of 
morbidity and mortality comparable
to those of the 1918 “Spanish
flu,” leaving unanswered
for almost a century questions 
regarding the extraordinary 
virulence and transmissibility of 
this unique strain. Technological 
advances made reconstruction
of the 1918 virus possible; now, 
continued research, vaccine de- 
velopment, and preparedness 
are essential to ensure that such 
a devastating public health event 
is not repeated. 

Over the past 20 years, studies 
of individual genes and the fully 
reconstructed live 1918 virus 
have identified numerous fea- 
tures that likely contributed to 
its robustness and rapid global 
spread. Importantly, this re- 
search has often been conducted 
in tandem with viral isolates 
from recent human and zoonotic 
sources, enabling insights from 
the 1918 virus to inform evalua- 
tions of current pandemic risk. As we now know, wild 
birds are the natural reservoir for influenza A viruses. 
With extensive antigenic and genetic diversity inher- 
ent among influenza virus surface proteins, a strain to 
which humans are immunologically naive could jump 
the species barrier at any time. A(H5N1) viruses and,
more recently, A(H7N9) viruses, are two such examples.
However, swine are also recognized as a “mixing vessel” 
for influenza viruses, and over the past two decades, 
there has been an increase in human cases following 
exposure to infected pigs. There is clearly, and alarm- 
ingly, a vast diversity of zoonotic sources of influenza A 
viruses that could acquire a transmissible phenotype in 
humans and cause a pandemic. 

What is our readiness today? Many international 
health agencies and research laboratories collaborate to
track influenza virus evolution, evaluate antigenic drift
among circulating and vaccine strains, and sequence 
viral genes to advance surveillance and preparedness. 
The production of improved vaccines and diagnostic 
tools, and better access to therapeutic agents represent 
resources that were not available a century ago. But in- 
fluenza viruses are moving targets, and a pandemic vi- 
rus could nevertheless emerge with as little warning in 
2018 as in 1918. As evidenced by 
this current flu season, influenza 
viruses can rapidly acquire muta- 
tions that evade our most recent 
vaccine formulations. A universal, 
broadly protective influenza vac- 
cine for seasonal epidemics—a 
goal of intense research efforts— 
would improve our preparedness 
for subsequent pandemics. 

How, then, can we best study 
emerging pandemic threats? 
Looking to the past, elucidating 
the role of specific molecular de- 
terminants that confer virulence 
and transmissibility of prior pan- 
demic viruses is one approach. 
But we must also look to the 
future. Advances in next-gen- 
eration sequencing are improv- 
ing our understanding of virus 
diversity. Investments in global 
partnerships and laboratory ca- 
pacity worldwide are strength- 
ening surveillance networks and 
diagnostic capabilities, and are 
also facilitating the identifica- 
tion of new viruses in humans and animals. The recent 
lifting of the U.S. moratorium on gain-of-function re- 
search on potential pandemic viruses further illustrates 
the contribution of unconventional, but responsible, re- 
search strategies to readiness. 

Philosopher George Santayana pointed out, “Those 
who cannot remember the past are condemned to re- 
peat it.” We are no doubt more prepared in 2018 for an 
infectious disease threat than in 1918. But it is critical 
to remember that preparation only stems from a global 
commitment to share data about viral isolates, support 
innovative research, and dedicate resources to assess 
the pandemic risk of new and emerging influenza vi- 
ruses from zoonotic reservoirs. 


-Jessica A. Belser and Terrence M. Tumpey 
“The practice of plunging live lobsters into boiling water,
which is common in restaurants, is no longer permitted.”


A Swiss government order to take effect 1 March, reported by The Guardian, 
revived the debate about whether lobsters feel pain. 


Edited by Jeffrey Brainard 
Space telescope’s optics pass test 


The Johnson Space Center’s Chamber A, originally used in Apollo missions, tested the James Webb 
Space Telescope in space conditions. 




NASA announced last week that the James Webb Space Telescope
(JWST), its next big observatory, due for launch a year from now, 
successfully completed the final test of its optics and instru- 
ments. NASA workers inserted the 6.5-meter primary mirror 
of gold-coated beryllium, along with its secondary mirror and 
instruments, into a giant test chamber at the Johnson Space 
Center in Houston, Texas, last year. Over 100 days, engineers slowly 
cooled the hardware to -235°C in a vacuum to mimic conditions in 
space. They also shone simulated starlight through the mirror system 
into the detectors. The JWST team performed the test because it de- 
cided early on it didn’t want to “make the Hubble mistake,” says proj- 
ect scientist John Mather, referring to the grinding error that initially 
marred the optics of the Hubble Space Telescope after NASA launched 
it in 1990. “We’ve confirmed the primary mirror is a great mirror,” he
says. Meanwhile, at prime contractor Northrop Grumman in Redondo 
Beach, California, engineers tested the JWST’s tennis court-size, 
multilayered sunshield for the first time in October 2017. 
AROUND THE WORLD
‘Eugenics conference’ probed 


LONDON | University College London 
(UCL) has opened an inquiry into a 
low-profile conference on the genetics of 
intelligence that included discussions of 
eugenics and was attended by some people 
who hold controversial views on race and 
intelligence. The London Conference on 
Intelligence has been held three times since 
2014, according to a student newspaper 
that last week reported details of the small, 
invitation-only event. UCL says it did not 
endorse the meeting and administrators 
were not notified about the speakers and 
content, a requirement before conference 
rooms can be booked. James Thompson, 
an honorary lecturer in psychology at UCL, 
who hosted the conference, will not be 
allowed to organize conferences “of this 
nature” while the investigation is underway, 
and the university will examine how he 
received his lectureship. Thompson could 
not be reached for comment. 


German R&D budget may rise 


BERLIN | More than 3 months after voters 
went to the polls in Germany, it’s still 
unclear whether Chancellor Angela Merkel’s 
Christian Democrat party will form a 
government with the Social Democrats. But 
if the two end up in a coalition, scientists 
are likely to benefit. In a marathon negotia- 
tion session last week, the two political 
parties drew up a 28-page blueprint for a 
formal coalition. It includes a pledge to 
raise spending on R&D from 3% to 3.5% of 
gross domestic product (GDP) by 2025.
That would put Germany’s spending on par 
with Japan’s. Only Israel and South Korea 
spend a higher percentage of their GDP
on R&D. The parties also tackled the controversial
herbicide glyphosate, agreeing
to reduce its use in Germany.


Deadly virus in Florida monkeys 


SILVER SPRINGS, FLORIDA | Scientists 
warn that bites or scratches from rhesus 
macaques roaming Silver Springs State 
Park in Florida could expose people to a 
deadly form of herpes. According to a study 
published online last week in Emerging 
Infectious Diseases, many of the monkeys
carry the herpes B virus, which can cause 
encephalitis and has killed almost half of 
all humans who have contracted it. So far, 
transmission of herpes B from macaques to 
people has been documented only in laboratories.
Researchers from the University
of Florida in Gainesville and the University
of Washington in Seattle found herpes B
in about one in four blood samples taken
from the macaques, and, worryingly,
as many as 30% of monkeys in one family
group also had the virus in their saliva.
A dozen macaques were introduced into the
park in the 1930s to encourage tourism, 
and today that population is about 175. 
Previous efforts to control the monkey pop- 
ulation have ended after public opposition. 


EPA says less about risk reviews 


WASHINGTON, D.C. | The U.S.
Environmental Protection Agency (EPA)
has curtailed the amount of information it
releases on preliminary assessments
of potentially hazardous new chemicals or
new uses of existing chemicals, drawing 
complaints from environmental and govern- 
ment transparency advocates. Until recently, 
EPA routinely released notices on whether
a preliminary review had concluded that
the agency should scrutinize a new chemical 
or use for potential risks to human health 
or the environment. But on 5 January, the 
agency revised the website that presented
that information. It now indicates only 
that a “focus meeting occurred” regarding 
a new chemical or use, without reporting 
the outcome, E&E News reports. “Gosh, 
that’s helpful,” Richard Denison, senior 
scientist at the Environmental Defense 
Fund in Washington, D.C., sniped in a blog 
post. “This change dramatically limits the 
agency’s accountability to the public, not to 
mention transparency.” 


New highway slices the Amazon 


MANAUS, BRAZIL | The pending comple- 
tion of a new highway through the Brazilian 
Amazon is alarming tropical ecologists,
who fear it will open the way to increased
habitat destruction as well as illegal hunting 
and development. The 870-kilometer road, 
known as BR319, connects Manaus, Brazil, 
in central Amazonia to Porto Velho, Brazil, in 
southern Amazonia. The project has been on 
the books for almost a decade, but in recent 
months work has accelerated such that now 
only the central section remains to be paved. 
If that occurs, BR319 and a second highway 
“will slice the Amazon in half along a north- 
south axis ... like a flayed fish,” says ecologist 
William Laurance, director of the Centre for 
Tropical Environmental and Sustainability 
Science at James Cook University in Cairns, 
Australia. Some natural reserves have been 
created along the highway, but “it’s now or
never in terms of increasing the measures to
limit the highway’s impacts, or even halting 
the project,” says Laurance, who believes 
the project “is one of the biggest environ- 
mental issues for the planet in 2018.” 


Shared credit, but men first 


STORRS, CONNECTICUT | Papers with 
multiple authors more often list men as 
the first author, even when the male and 
female authors are noted as contributing 
equally, a study has found. Such imbal- 
ances may reflect conscious or unconscious 
biases, Nichole Broderick of the University 
of Connecticut in Storrs and Arturo 
Casadevall of Johns Hopkins University in 
Baltimore, Maryland, suggest in a preprint 
posted 31 December 2017 on the bioRxiv 
server. The disparity emerged when they 
analyzed about 3000 such articles on 
biomedicine published from 1995 to 2017 
in journals including Science, Nature, and 
PLOS Biology. Since 2007, the disparity 
has not been statistically significant, indi- 
cating that unequal treatment may have 
lessened in recent years. Because first- 
author credits can influence a scientist’s 
funding and career advancement, journals 
should ask authors to explain the order
of listing, Broderick and Casadevall propose.
Following their own advice, they explain
in their paper that their author order is
alphabetical and by increasing seniority. 


New spider species proliferate 


A group of tiny arachnids called pelican spiders,
named for their uncanny resemblance to
the birds, uses beaklike mouthparts to spear
other arachnids. Now, scientists have discovered
18 new species of these spiders living in
Madagascar, they report in the 11 January issue of
ZooKeys, doubling the number known to occur on the 
African island nation. The family of rice-size animals 
was first discovered in 1854 in a 50-million-year- 
old slab of amber. Scientists from the Smithsonian 
National Museum of Natural History in Washington, 
D.C., and the Natural History Museum of Denmark 
in Copenhagen identified the new species by looking 
at hundreds of pelican spiders under a microscope. 
Though all had “beaks” (left), some sported lon- 
ger mouthparts and “necks,” and others had more 
spines—a telltale sign that they were members of 
different species. Pelican spiders use their elongated 
mouthparts to snatch small spiders and hold them at 
a distance while they inject them with venom. It’s 
a nifty strategy, but Madagascar’s pelican spiders face 
an uncertain future as the island’s rainforests where 
they live are disappearing. The birdlike arachnids also 
prowl the forests of South Africa and Australia. 
IN DEPTH


ASTRONOMY 


The exoplanet β Pictoris
b plows a path through 
its star’s disk of gas and 
dust in this illustration. 


Newborn exoplanet eyed for moons and rings 


Dedicated microsatellite joins telescopes watching for rare transit of β Pictoris b


By Daniel Clery 


Astronomers are staring at a nearby
star in hopes of seeing a giant baby
of a planet pass across its face, perhaps
accompanied by dust clouds,
rings, or newborn moons. Last week,
the newest and tiniest telescope
joined the vigil, when the French-built PicSat
rode into orbit on an Indian rocket. It
will be able to continuously monitor the
star, β Pictoris, until chances of seeing the
once-in-20-year transit event diminish in a 
few months’ time. “We can’t miss this. We 
would be kicking ourselves,” says astrono- 
mer Matthew Kenworthy of Leiden Univer- 
sity in the Netherlands. 

Astronomers have seen thousands of 
exoplanets transit, or cross the face of their 
stars, eclipsing a fraction of their light. But 
β Pictoris, a bright star just 63 light-years
away, is special. It is a natural laboratory for 
how solar systems form because it is only 
24 million years old—the “equivalent of a 
baby of a few weeks,” says Sylvestre Lacour 
of the Paris Observatory. 

In 1984, astronomers observed a disk of gas 
and dust around it, the first protoplanetary 
disk to be seen (Science, 21 December 1984, 
p. 1421). The disk, viewed nearly edge on, was 
warped and had gaps, a sign of planets in 
the making. But it wasn’t until 2009 that re- 
searchers spied the faint glow of a hot, young 
giant planet, 10 times the mass of Jupiter, in a
roughly 20-year orbit. Now dubbed β Pictoris
b, it is one of only a handful of exoplanets to be
imaged directly. 

The discovery could explain why, in 1981, 
β Pictoris’s light dimmed erratically by up
to 6% over 2 weeks, then brightened again. 
Another transit may have passed unnoticed 
2 decades later, and the newfound planet 
appeared to be heading for yet another one 
in 2017 or 2018. Recent calculations sug- 
gest that the transit will be a near miss. 
But the planet’s large “Hill sphere,” a zone
of gravitational influence that may contain
planetary rings, clouds of material, or newly
formed moons, may yet reveal itself in dips
in the light of β Pictoris.

To catch those dips, astronomers needed 
to monitor the star 24 hours a day over most 
of a year—too big a commitment for most 
observatories. So Lacour and his colleagues 
decided to build a small one of their own. In 
3 years, with €1.5 million from the European 
Research Council, they built PicSat, a 5-centi- 
meter space telescope in a satellite only 
slightly larger than a toaster. “It was risky 
and not everyone believed in it,” Lacour says.

Kenworthy, along with Eric Mamajek 
of NASA’s Jet Propulsion Laboratory in
Pasadena, California, decided to observe 
from the ground. They built two washing 
machine-size robotic observatories, dubbed 
bRing, sited in Australia and South Africa. 
A few existing telescopes have also
joined the hunt: the Bright Target Ex- 
plorer Constellation, five microsatellites 
designed to study luminous stars; and a 
40-centimeter telescope that’s part of 
the Antarctic Search for Transiting ExoPlanets,
which can watch β Pictoris continuously
during the darkness of the southern
winter. “We had to ensure we didn’t drop 
the ball and had at least one scope on the 
star during the transit,” Kenworthy says.
The researchers also lined up agreements 
with larger telescopes to swing into action 
if they did see something. 

PicSat, scheduled for launch in Septem- 
ber 2017 but delayed by a launcher fail- 
ure, finally reached orbit on 12 January. 
The Paris team is now checking its health. 
Sadly, it is joining the party toward its end; 
the transit of the Hill sphere is expected to 
end in February. “Maybe we won’t see any- 
thing. We knew from the start it was risky,” 
Lacour says. 

Kenworthy remains optimistic. “We’re
not discounting anything,” he says. Even a 
null result will imply that just 24 million 
years after its birth, the baby planet has al- 
ready cleared out its Hill sphere. And once 
the predicted transit of β Pictoris b is over,
the astronomers will keep watching the 
star and its planet nursery, hoping to see 
something else, like the fleeting transit of a 
smaller baby planet. 
DIAGNOSTICS


‘Liquid biopsy’ for cancer 
promises early detection 


Combining DNA and protein markers brings researchers 
closer to a universal cancer screening test 


By Jocelyn Kaiser 


A team of researchers has taken a major
step toward one of the hottest
goals in cancer research: a blood
test that can detect tumors early.
Their new test, which examines
cancer-related DNA and proteins in
the blood, yielded a positive result about 
70% of the time across eight common can- 
cer types in more than 1000 patients whose 
tumors had not yet spread—among the best 
performances yet for a universal cancer 
blood test. It also narrowed down the form 
of cancer, which previously published pan- 
cancer blood tests have not. 

The work, reported online today in 
Science, could one day lead to a tool for 
routinely screening people and catching 
tumors before they cause symptoms, when 
chances are best for a cure. Other groups, 
among them startups with more than 
$1 billion in funding, are already pursuing 
that prospect. The new result could put the 
team, led by Nickolas Papadopoulos, Bert 
Vogelstein, and others at Johns Hopkins 
University in Baltimore, Maryland, among 
the front-runners. 

“The clever part is to couple DNA with 
proteins,” says cancer researcher Alberto
Bardelli of the University of Turin in Italy, 
who was not involved in the work. The re- 
searchers have already begun a large study 
to see whether the test can pick up tumors 
in seemingly cancer-free women. 


Genetic mutations drive the growth of 
cancer cells, and dying cells shed some of 
this mutated DNA into the blood. The Johns 
Hopkins group and others have shown that 
so-called liquid biopsies of blood-borne tu- 
mor DNA can reveal, for example, whether 
a patient’s cancer should respond to a spe- 
cific drug. But detecting the scant DNA 
released by early stage tumors is still chal- 
lenging. Companies such as the $1 billion 
Grail, launched in 2016 by sequencing giant 
Illumina, are using a big data approach, se- 
quencing hundreds of genes in thousands of 
cancer patients’ blood in search of a defini- 
tive set of DNA markers. 

The Johns Hopkins researchers and 
collaborators found that gains in the de- 
tection rate tailed off when they added 
more genes to their test. They decided to 
sequence parts of just 16 genes often mu- 
tated in different types of cancer. They then 
added eight known protein biomarkers 
characteristic of specific kinds of cancer. 
This bumped up sensitivity and allowed 
the team to home in on the tissue type of 
the tumor. 

In blood samples from 1005 patients 
with eight types of tumors that had evi- 
dently not yet metastasized, the test de- 
tected between 33% and 98% of cases, 
depending on the tumor type (see graph, 
below). The sensitivity was 69% or higher 
for ovarian, liver, stomach, pancreatic, and 
esophageal cancers—all types that are dif- 
ficult to detect early. 


A screening scorecard 


A new cancer blood test worked better for some types than others, and caught only 43% of stage 1 cancers.


(Error bars represent 95% confidence intervals.) 
The test rarely found cancer that wasn’t
there. Only seven of 812, or less than 1%, 
of healthy controls tested positive. And 
the test, called CancerSEEK, narrowed the 
origin of the cancer to two possible sites in 
about 80% of patients. The team, which is 
applying for patents on CancerSEEK, esti- 
mates the cost at less than $500 per sample. 
“That’s a very attractive number,” says mo- 
lecular pathologist Anirban Maitra of the 
MD Anderson Cancer Center in Houston, 
Texas, because it is in the range of other 
cancer screening tests such as colonoscopy. 
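
The control figures above translate directly into a specificity estimate. As a rough illustration, the short Python sketch below computes the false-positive rate and specificity from the reported counts (7 positives among 812 healthy controls), then shows how the share of positive results that reflect a real cancer would depend on how common cancer is in the group being screened. The roughly 70% sensitivity comes from the reported results; the 1% prevalence and the function names are assumptions chosen only for illustration, not figures from the study.

```python
def specificity(false_positives: int, healthy_controls: int) -> float:
    """Fraction of healthy controls who correctly test negative."""
    return 1 - false_positives / healthy_controls


def positive_predictive_value(sensitivity: float, spec: float, prevalence: float) -> float:
    """Share of positive results that are true cancers (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - spec) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Reported in the article: 7 of 812 healthy controls tested positive.
spec = specificity(false_positives=7, healthy_controls=812)

# Reported overall sensitivity was about 70%.
# ASSUMPTION: a 1% cancer prevalence in the screened group, for illustration only.
ppv = positive_predictive_value(sensitivity=0.70, spec=spec, prevalence=0.01)

print(f"False-positive rate: {1 - spec:.2%}")
print(f"Specificity: {spec:.2%}")
print(f"PPV at assumed 1% prevalence: {ppv:.0%}")
```

Under these assumed inputs, fewer than half of positive results would reflect a real cancer, which illustrates why even a false-positive rate below 1% matters when a test is used for screening.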

Maitra and others point to caveats, how- 
ever. One is that the cancer-related pro- 
teins used by the test reflect tissue damage 
and can also appear in people with inflam- 
matory diseases such as arthritis. That 
means the 1% false positive rate will likely 
be higher in less healthy populations, 
notes proteomics researcher Lance Liotta 
of George Mason University in Manassas, 
Virginia. What’s more, the 1005 patients al- 
ready had cancer symptoms; CancerSEEK 
probably won’t work as well in asymptom- 
atic patients whose smaller tumors may 
shed less DNA. In fact, the test picked up 
only 43% of very early, stage 1 cancers. 
“We’re still not there yet,” Bardelli says.

The Johns Hopkins team thinks Cancer- 
SEEK is ready for testing as a screening 
tool. “A test does not have to be perfect to be 
useful,” Papadopoulos says. In collaboration 
with Johns Hopkins, the Geisinger Health 
System in Pennsylvania has already begun 
to use CancerSEEK on blood samples from 
female volunteers between ages 65 and 75 
who have never had cancer. The planned 
$50 million, 5-year study of up to 50,000 
women is being funded by a private philan- 
thropic group, The Marcus Foundation. 

For those who test positive twice, the next 
step will be imaging to find the tumor. But 
that will bring up questions raised by other 
screening tests. Will the test pick up small 
tumors that would never grow large enough 
to cause problems yet will be treated any- 
way, at unnecessary cost, risk, and anxiety 
to the patient? Papadopoulos thinks the 
problem is manageable because an expert 
team will assess each case. “The issue is not 
overdiagnosis, but overtreatment,” he says. 

Still, others working on liquid biopsies 
say that it will take time to figure out 
whether widespread screening of healthy 
people with a universal blood test can re- 
duce cancer deaths without doing harm. 
“If people expect to suddenly catch all can- 
cers, they’ll be disappointed,” says cancer 
researcher Nitzan Rosenfeld of the Univer- 
sity of Cambridge in the United Kingdom. 
“This is exciting progress,” he says. “But 
evaluating it in the real world will be a 
long process.”
EVOLUTION


Tamed immune reaction aids pregnancy


Evolutionary studies show how dialing back inflammation allows embryo implantation 


By Elizabeth Pennisi, 
in San Francisco, California 


The riskiest moment in any human
pregnancy is arguably when the fer- 
tilized egg attaches to the womb wall 
and tries to establish a lifeline be- 
tween embryo and mother. About half 
of in vitro pregnancies fail during this 
implantation stage, and many natural preg- 
nancies end then as well. Now, research- 
ers comparing pregnancy in opossums and 
several other mammals have shown how 
precise control of an immune process, in- 
flammation, is critical to success or failure. 
In work reported here this month at the
annual meeting of the Society for Integrative
and Comparative Biology, a Yale University
team led by evolutionary developmental biologist
Günter Wagner concluded that humans
and other so-called placental mammals have
tweaked an ancient inflammatory process to
enable embryos to implant and persist in the
womb. Placental mammals, named for the
mass of tissue in the uterus that serves as the
interface between mother and fetus, have
specialized uterine cells that suppress the
release of a key immune-stimulating molecule.
This suppression may help delay the
rejection of the embryo until it’s ready to be
born, Arun Chavan, a Yale graduate student
in Wagner’s lab, told the meeting.

An unrestrained inflammatory response triggers the birth of opossum young early in their development.

Beyond solving a key mystery about preg- 
nancy, the work could also point to treat- 
ments for infertility and miscarriage, says 
Tom Stewart, an evolutionary developmen- 
tal biologist at the University of Chicago in 
Illinois. “The more we understand about 
pregnancy in other species, the more likely 
it is that we can treat medical issues that 
arise during human pregnancy.” 

Researchers have always puzzled over 
why the mother allows an embryo, which 
is basically a parasite, to settle in and grow. 
Yet implantation “was a critical first step in 
evolving pregnancy as humans experience 
it,” says Julia Bowsher, an integrative
biologist at North Dakota
State University in Fargo.

This seeming paradox is 
even more perplexing because 
although a mother’s inflamma- 
tory reaction to this “parasite” 
is the biggest threat to preg- 
nancy, it also seems necessary 
for the pregnancy to be success- 
ful, Wagner, Chavan, and Yale 
postdoc Oliver Griffith pointed 
out last year. A woman’s chance 
of implantation actually in- 
creases if her uterus has suf- 
fered mild trauma, for example, 
from a uterine biopsy as part 
of an in vitro fertilization (IVF) 
procedure. Studies have shown 
that the IVF embryo is more 
likely to settle in, particularly 
at the biopsy site. Furthermore, 
an immune “rejection” response 
helps create the contractions 
necessary for a baby’s birth. Yet 
in between implantation and 
birth, the immune system is 
held in check, allowing the fetus to thrive. 

To understand the evolutionary basis for 
this interlude, Griffith recently led a study 
of gene activity in a marsupial, the gray 
short-tailed opossum (Monodelphis domes- 
tica). Marsupials have very short pregnan- 
cies. Early opossum embryos develop for 
about 12 days, enclosed as shelled eggs in 
the womb. They then shed their shells and 
try to attach to the uterine wall, activating 
placenta-promoting genes. But after about 
2 days, the mother’s immune system “rejects”
the embryos, causing the birth of a
litter that is still very immature compared
with newborn placental mammals.

Published by AAAS

Griffith sampled opossum gene activ- 
ity before pregnancy, during the egg-shell 
stage, and after implantation. The analysis 
revealed the array of immune system signal- 
ing molecules and steroid hormones taking 
part in the immune attack on the embryo. 
The gene activity also pointed to a role for 
immune cells such as neutrophils, which 
launch a full-fledged inflammatory reac- 
tion that includes molecules that stimulate 
contractions of the uterus. The timing and 
makeup of this response largely mirror what 
is seen in implantation in placental mam- 
mals, indicating that the process evolved 
in the common ancestor of placental and 
marsupial mammals, Griffith and colleagues 
reported in the 26 July 2017 issue of the Pro- 
ceedings of the National Academy of Sciences. 

But later in evolution, placental mammals 
apparently dialed back that inflammation to 
allow extended gestation. To find out how, 
Chavan compared implantation in the opossum
with that in a range of placental mammals:
rabbits, armadillos, and hyraxes,
3-kilogram rodentlike mammals that are closely
related to elephants. Based on studies of
gene activity and immune cells, he found 
that these mammals have “domesticated” 
implantation’s inflammatory response. At 
the implantation site, blood vessels prolifer- 
ate in the uterine wall—the same hallmark 
of inflammation seen in the opossum—but 
the signaling molecule IL-17, which recruits 
neutrophils, is missing, Chavan reported at 
the meeting. 

Specialized cells called decidual cells 
seem to be responsible, he found. These 
cells form in the uterine lining early in 
pregnancy and, in many placental mam- 
mals, disappear right after implantation. 
Chavan wondered whether these cells 
might have evolved to switch the inflam- 
matory response into low gear. Supporting 
that notion, he found in tissue studies that 
secretions of those cells could keep immune 
cells from making IL-17. 

“If that switch doesn’t happen, there are 
miscarriages,” says Gil Mor, a reproductive 
immunologist at Yale who was not involved 
with the work. “Understanding the evolu- 
tion of decidual cells will be extremely help- 
ful to those of us studying the nitty-gritty” 
of pregnancy.
FISHERIES


Tensions flare over electric 
fishing in European waters 


European Parliament calls for total ban of a technique 
that saves fuel and reduces damage to marine life 


By Erik Stokstad 


Bottom trawling is one of the most destructive
types of fishing, decried for
churning up massive swaths of seabed
and leaving dead sea urchins,
mollusks, and other creatures in its
wake. In the North Sea, Dutch fishing
vessels are substituting a subtler technique 
for this brute-force method: using short 
bursts of electricity to get flatfish out of the 
sediment and into nets. But they are stir- 
ring up just as much controversy. 

Dutch fishing companies say pulse trawl- 
ing is less damaging to marine ecosystems 
and saves energy. But fishing groups in 
other EU countries are increasingly angry 
about competition from the Dutch pulse 
trawlers. And a coalition of environmental 
organizations worries about harm to non- 
target marine life. Other nongovernmental 
organizations, including Greenpeace Neth- 
erlands, say pulse trawling has promise and 
that ending it now would penalize the fish- 
ing industry for innovating. 

“A train of emotion is now go- 
ing full speed,” says Marloes Kraan, an 
anthropologist at Wageningen Marine Re- 
search in IJmuiden, the Netherlands. And it 
appears likely to accelerate: On 16 January, 
the European Parliament voted to ban the 
technique as the first step in negotiations 
with the European Commission and mem- 
ber states over a large package of fisheries 
reforms. A ban, or even a major reduction 
in pulse trawling, would be a huge blow to 
the Dutch fishing industry. 

Most bottom trawlers drag a net, held 
open by a wide metal beam, across the bot- 
tom to catch shrimp or fish. Trawlers target- 
ing flatfish, such as sole or plaice, also use 
dangling iron chains to scare them out of 
the sediment. The beam and chains disturb 
or kill many bottom-dwelling organisms, 
the nets catch unwanted species, and all the 
tugging requires a lot of diesel. 

Pulse trawlers, by contrast, barely touch 
the bottom because they use bursts of low- 
voltage electricity to catch flatfish, particu- 
larly Dover sole (see graphic, right). After 
the current briefly cramps their muscles, 
they try to flee, and many end up in the net. 
Because sole are more susceptible to elec-
tricity than other species, pulse trawling
reduces bycatch. And the gear is lighter and 
can be towed slower, so the boats burn half 
as much fuel. “We catch with a lesser en- 
vironmental impact and greater economic 
returns,” says Pim Visser of VisNed, a trawl- 
ing trade group in Urk, the Netherlands. He 
credits the gear with saving many fishing 
companies from bankruptcy. 

Encouraged by initial studies, the Dutch 
government in 2006 successfully lobbied 
the European Commission to allow 5% 
of each country’s fleet to use pulse trawl- 
ing, exempting them from the European 
Union’s 1988 general ban on electrical 
fishing. By 2009, Dutch companies had 
embraced the opportunity. As demand grew, 
they received additional licenses for reduc- 
ing bycatch or for research, with the condi- 
tion that they provide detailed data on their 
catches. Now, 75 vessels, about 28% of Dutch
trawlers, use pulse gear. Fishing companies
outside the Netherlands fish for sole, too,
but don’t specialize in it; as a result, few
have invested in the expensive technology.

[Graphic: A charged approach. Many Dutch fishing vessels have adopted electric pulse trawling, but competitors and some environmental groups object.]

BLOOM Association, an environmental 
group in Paris, argues that the research and 
bycatch licenses are illegal and a guise for 
commercial fishing, and that pulse trawling 
puts small-scale fishing at an even bigger 
disadvantage than conventional trawling 
does. BLOOM advocates catching flatfish 
with gillnets, stationary curtains of netting 
that have a much lower bycatch rate than 
either kind of trawling and do less damage 
to the sea floor. “There shouldn’t be any use 
of electric current,” says BLOOM Director
Claire Nouvian. “We’ve got enough evidence 
to know this is nonsense.” 

Scientists have so far found little evi- 
dence that the electrical currents cause se- 
rious harm. Last year, a working group with 
the International Council for the Explora- 
tion of the Sea (ICES) highlighted harm to 
large cod and whiting as the only known 
irreversible effect. Although not many cod 
are accidentally caught by pulse trawlers, 
about 10% of them suffer vertebral fractures 
and hemorrhages when their muscles over- 
contract from the shocks. Initial laboratory 
research on other organisms has not shown 
lasting, serious effects, but the ICES group 
says questions remain, for instance about 
the effects on sharks and rays. 

Nevertheless, “We know enough to con- 
tinue with pulse trawling in the present 
context,” says Adriaan Rijnsdorp, a fisheries 
biologist at Wageningen Marine Research 
and a co-chair of the ICES working group. 
But he says a decision on the future of pulse 
trawling should wait until 2019, when a 
4-year, EU-funded research program on 
ecological impacts, which he coordinates, is 
due to wrap up. 

Any decision will have to be agreed on by 
the European Parliament, the commission, 
and member states, in this case represented 
by their fisheries ministers. The commission 
has proposed removing the cap on licenses in 
the southern North Sea, where pulse trawl- 
ing now occurs; other areas could follow 
after further studies. The ministers, by con- 
trast, would de facto remove licenses beyond 
the 5% limit of a country’s fleet, which would 
force most Dutch vessels to give up pulse 
trawling. A compromise in which the tech- 
nique is greatly curtailed is the most likely 
outcome, says Irene Kingma, director of the 
Dutch Elasmobranch Society in Amsterdam, 
which promotes the study and conservation 
of sharks and rays. “There might be carnage 
within the Dutch fishing sector,” Kingma 
says. “And if they change back to beam trawl- 
ing, we have all the environmental problems 
from that.”
SCIENTIFIC COMMUNITY


Rochester roiled by fallout 
from sexual harassment case 


Report supports university's handling of explosive charges 
involving linguist T. Florian Jaeger, but president bows out


By Meredith Wadman 


After months of turmoil that has riven
the University of Rochester in New
York and its highly esteemed cogni-
tive science department, the contro-
versy took a pivotal turn last week.
An outside attorney hired by the uni- 
versity released a 207-page investigation that 
largely approved of the university’s handling 
of explosive sexual harassment claims, which 
have sparked formal complaints, campus 
protests, and a boycott effort. The report also 
concluded that no individual woman had 
experiences with professor T. Florian Jaeger 
that met what it described as the “demand- 
ing” legal standard for sexual harassment, 
which is that it is severe or pervasive enough 
to create a hostile environment. 

The turmoil is far from over. The university 
still faces a federal lawsuit, filed last month 
by seven current or former faculty, a former 
postdoc, and a former graduate student. They 
allege that the university allowed Jaeger to 
create a hostile environment and retaliated 
against them when they complained. “This 
is what it looks like when an institution pro-
tects a sexual harasser,” says plaintiff Steven
Piantadosi, an assistant professor in the De-
partment of Brain and Cognitive Sciences
(BCS). Adding to the uncertainty, just as the
report was released on 11 January, university
head Joel Seligman announced he was leav-
ing, effective 28 February. In a statement,
Seligman, who is a defendant in the lawsuit, 
said, “It is clear to me that the best interests 
of the University are best served with new 
leadership, and a fresh perspective to focus 
on healing our campus.” As for Jaeger, he 
declined through his lawyer to say whether 
he has discussed his employment status with 
the university. 

The report was written by Mary Jo White, 
a partner at the law firm Debevoise & Plimp- 
ton in New York City who is a former U.S. 
attorney and former chair of the Securities 
and Exchange Commission. The university’s 
Board of Trustees hired her in September 
2017 to investigate complaints about Jaeger, 
as well as the university’s previous probes of 
the case and the claims of retaliation. (The 
university paid her firm $4.5 million for 
its work.) She concluded that although the 
behavior of Jaeger, a professor in the BCS 
department, was “offensive,” “inappropri- 
ate” and “disturbing,” he violated neither 
university policies then in place nor federal 
or state sexual harassment laws. 

[Photo: Last fall, campus protesters targeted University of Rochester President Joel Seligman.]


The report concedes that Jaeger sent an 
unwanted photo of his penis to a student; 
made inappropriate sexual comments in 
social and academic settings, including com- 
menting on a student’s attractiveness in front 
of peers and faculty; flirted with students; 
served on a thesis committee for a student he 
had had sex with; submitted a letter of rec- 
ommendation for an undergraduate while in 
a sexual relationship with her; and engaged 
in multiple consensual relationships with 
current, former, or prospective students. “A 
number of female graduate students from 
that time period told us that, as a result of 
Jaeger’s reputation or behavior, they made a 
conscious decision to avoid him and the edu- 
cational opportunities he offered, which we 
found to be very troubling,” the report states. 

But the report finds that Jaeger did not 
sexually harass any individual woman or 
breach university policies in place at the 
time. Jaeger arrived at the university in 2007, 
and the complaints center on his behavior 
from that time until 2013. The report found 
that he engaged in no sexual relationships 
with current or former students after 2011. 
The university did not bar intimate relation- 
ships between faculty and undergraduates or 
between faculty and graduate students over 
whom they have authority until 2014. 

White’s team at Debevoise & Plimpton, 
where her group is deployed to help institu- 
tions in crisis, also found that the school's 
investigations of Jaeger’s behavior were 
impartial and that no retaliation occurred. 
It called the accounts by Jaeger’s accusers 
“exaggerated and misleading in many re- 
spects.” It adds: “We think that the Univer- 
sity acted in good faith and appropriately 
under its then-current policies and that the 
steps it took in an effort to navigate an un- 
usually difficult situation were reasonable.” 

However, the report does conclude that 
the school handled some matters poorly, 
such as sharing complaining faculty mem- 
bers’ private emails with the BCS chair and 
promoting Jaeger to full professor in 2016, 
while he was under investigation. Its poli-
cies and procedures for addressing sexual
harassment complaints “can and should be 
enhanced,” the report states. 

Jaeger responded in a statement: “This 
report does not exonerate me, but neither 
does it give merit to many of the worst ac- 
cusations made against me. ... I appreciate 
their commitment to seeking out the truth.” 

Some of his colleagues agree. “The White 
report largely hits the mark and affirms 
what the majority of my faculty colleagues 
have been thinking and experienced,” Ralf 
Haefner, an assistant professor in the BCS 
department since 2014, wrote in an email to
Science. “While some of Jaeger’s behavior as 
a junior faculty member was inappropriate, 
he clearly wasn’t a sexual predator.” 

The complainants, who refused to be 
interviewed by White’s firm because of 
the ongoing litigation, strongly disagreed. 
“The thrust of their report is to admit that 
many bad things happened at UR—but 
miraculously ... no legal liability attaches 
to the university,” the plaintiffs’ lawyer, Ann
Olivarius of McAllister Olivarius in Maid- 
enhead, U.K., said at an 11 January news 
conference. “In fact, there is substantial 
case law the report ignores that strongly 
supports the idea that the university is ab- 
solutely liable for the hostile environment 
created by Jaeger’s actions.” 

The report does not evaluate the ef- 
fect of Jaeger’s behavior in the aggregate, 
which the court might do, notes Alexandra 
Tracy-Ramirez, a lawyer at HopkinsWay 
in Phoenix. The plaintiffs also dispute de- 
tails of White’s account. On 13 January, 
they wrote to the faculty Senate Executive 
Committee, saying the report suppresses 
and misrepresents evidence. “In court, our 
audio recordings ... will establish blatant 
factual inaccuracies in Ms. White’s narra- 
tive,” they add. Their suit names the uni- 
versity, Seligman, and Provost Robert Clark. 
In addition to retaliation and defamation, 
it charges that the school allowed a hostile 
environment for three plaintiffs. University 
attorneys are expected to file their first re- 
sponse in early February. 

The BCS department now faces the task 
of healing and rebuilding. Of the nine 
plaintiffs, seven have left or will soon leave 
the university. Two, Piantadosi and Celeste 
Kidd, assistant professors who are married 
to each other, are searching for positions. 
The BCS website lists 40 remaining faculty. 

Recruiting students may be difficult, 
given an open letter signed late last year 
by 454 professors in cognitive neurosci- 
ence and other disciplines. The signatories 
vowed not to encourage students to attend 
the university because it had “abrogated” its 
duty to students “by supporting the preda- 
tor and intimidating the victims.” 

Few dispute that the department has 
been damaged by the case. “It was known 
as one of the top cognitive neuroscience 
departments in the country,” says cogni- 
tive neuroscientist Timothy Verstynen of 
Carnegie Mellon University in Pittsburgh, 
Pennsylvania, who signed the letter. “They 
really did take a hit by losing some strong 
key faculty.” The Board of Trustees said last 
week it will need time to digest the lengthy 
report before taking action. It also faces the 
task of replacing Seligman, whom it called a 
“brilliant, transformative leader.”
CRIMINAL JUSTICE


Are algorithms
good judges?


People are as good as machines in predicting rearrest 


By Catherine Matacic 


Every day, judges across the United
States face an important decision: 
Should they jail a defendant whose 
innocence or guilt has not yet been 
determined, or should they release 
that person back into the community, 
where he or she might commit a crime? 

Increasingly, courts are turning to comp- 
uter-based tools to help make those deci- 
sions, lured by the promise of complex 
algorithms that use an array of factors to 
spit out risk scores. But a new study sug- 
gests that at least one widely used algo- 
rithm produces risk assessments that are no 
better than those reached by people given 
just a few key pieces of infor- 
mation about a defendant. 

The result, published this 
week in Science Advances, 
challenges widespread as- 
sumptions that algorithms 
are better at calculating risk 
than humans, researchers 
say. “A fancy model isn’t nec- 
essarily a better model,” says 
David Robinson, a legal scholar at George- 
town University in Washington, D.C. 

Julia Dressel, a computer science major 
at Dartmouth College, got interested in risk 
scoring algorithms after reading a 2016 se- 
ries by investigative reporters at ProPublica 
about one popular system, the Correctional 
Offender Management Profiling for Alter- 
native Sanctions (COMPAS), which is used 
in at least five states and can cost up to 
$22,000 a year. After examining COMPAS 
scores for 10,000 defendants awaiting trial 
in Broward County in Florida, as well as 
their arrest records over the next 2 years, 
ProPublica concluded that the tool dispro- 
portionately classified black offenders as 
being at high risk of rearrest. 

That finding concerned civil rights advo- 
cates, but Dressel went on to ask another 
question: Are humans or machines better 
at assessing the risk of rearrest? To find 
out, she randomly selected 1000 of the de- 
fendants and recorded seven pieces of in- 
formation about each, including their age, 
sex, and number of previous arrests. She 
then recruited 400 people using the online 
crowdsourcing service Amazon Mechanical 
Turk. Each received profiles of 50 defen- 
dants and was asked to predict whether 
they would be rearrested within 2 years, the 
same standard COMPAS uses. The human 
judges were right about 63% to 67% of the 
time, compared with about 65% of the time 
for COMPAS. 

Dressel was surprised. So was Megan 
Stevenson, an economist and legal scholar 
at George Mason University in Arlington, 
Virginia, who calls the study the first to run 
a “horse race” between human and algo- 
rithm. She always assumed that algorithms 
were somewhat better, so the results left her 
“quite shocked.” 

In a second experiment, Dressel and 
her adviser, Dartmouth computer scientist 
Hany Farid, explored whether a simpler 
algorithm could beat COMPAS’s, which is 
proprietary but described 
in technical literature. They 
created their own, ultimately 
settling on just two factors: 
age and number of prior 
convictions. Plugging that 
information into a simple for- 
mula yielded predictions that 
were about 67% accurate, 
roughly matching COMPAS. 
Robinson says that result reflects some- 
thing that has long been known in crimi- 
nology: If you’re young, you’re risky.
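
To make the idea of a two-factor formula concrete, here is a minimal sketch in Python. It assumes a logistic form, which the article does not specify. The weights (`W_AGE`, `W_PRIORS`, `BIAS`), the function names, and the toy records are all hypothetical; none of them come from the study or from COMPAS.

```python
import math

# Hypothetical weights, chosen only for illustration. They are NOT the
# coefficients from the study or from COMPAS.
W_AGE = -0.05      # assumed: each extra year of age lowers the score
W_PRIORS = 0.3     # assumed: each prior conviction raises the score
BIAS = 1.0


def rearrest_probability(age, priors):
    """Logistic score built from two inputs: age and prior convictions."""
    z = BIAS + W_AGE * age + W_PRIORS * priors
    return 1.0 / (1.0 + math.exp(-z))


def predict(age, priors, threshold=0.5):
    """Return True when the score meets or exceeds the threshold."""
    return rearrest_probability(age, priors) >= threshold


def accuracy(records):
    """Fraction of (age, priors, was_rearrested) records predicted correctly."""
    records = list(records)
    hits = sum(predict(age, priors) == outcome for age, priors, outcome in records)
    return hits / len(records)


# Invented toy records, not real defendants: (age, prior convictions, rearrested?)
TOY_RECORDS = [
    (19, 3, True),
    (45, 0, False),
    (23, 1, True),
    (60, 2, False),
    (30, 0, True),
]

if __name__ == "__main__":
    print(accuracy(TOY_RECORDS))
```

With these invented weights the score falls as age rises and climbs with each prior conviction, the pattern Robinson describes. A real model would fit its weights to actual records and measure accuracy on cases it had not seen.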

Mathematician Tim Brennan, who cre- 
ated COMPAS in 1998 when working at 
Northpointe (now Equivant) in Canton, 
Ohio, says that far from undercutting COM- 
PAS, the new study validates his approach. 
Seventy percent accuracy, he says, has long 
been considered the “speed limit” of such 
prediction systems, and the fact that hu- 
mans did no better is encouraging. 

But humans are no better than machines 
at eliminating bias, notes mathematician 
Cathy O’Neil, founder of the risk consult- 
ing and auditing firm O’Neil Risk Consult- 
ing & Algorithmic Auditing in New York 
City. Dressel’s study, for example, found 
that people were just as likely as COM- 
PAS to overstate rearrest risks for black 
defendants and understate risks for white 
defendants. That’s troubling, given that 
similar algorithms are increasingly influ- 
encing not just court decisions, but also 
loan approvals, teacher evaluations, and 
even whether child abuse charges are in- 
vestigated by the state. “People get awed by 
mathematical sophistication,” O’Neil says, 
“but it’s mostly a distraction.”
NEWS


THE BELIEVER 


How a Mormon lawyer transformed Mesoamerican 
archaeology—and ended up losing his faith 


By Lizzie Wade, in San Cristóbal de las Casas, Mexico


Thomas Stuart Ferguson lay in his
hammock, certain that he had 
found the promised land. It had 
been raining for 5 hours in his 
camp in tropical Mexico on this 
late January evening in 1948, and 
his three campmates had long 
since drifted off to sleep. But 
Ferguson was vibrating with ex- 
citement. Eager to tell someone what he 
had seen, he dashed through the down- 
pour to retrieve paper from his supply bag. 
Ensconced in his hammock’s cocoon of mos- 
quito netting, he clicked on his flashlight 
and began to write a letter home. 

“We have discovered a very great city here 
in the heart of ‘Bountiful’ land,” Ferguson
wrote. According to the Book of Mormon, 
Bountiful was one of the first areas settled by 
the Nephites, ancient people who supposedly 
sailed from Israel to the Americas around 
600 B.C.E. Centuries later, according to the 
scripture, Jesus appeared to the Nephites in
the same region after his resurrection. Mor- 
mons like Ferguson were certain that these 
events had happened in the ancient Ameri- 
cas, but debates raged over exactly how 
their sacred lands mapped onto real-world 
geography. The Book of Mormon gave only 
scattered clues, speaking of a narrow isth- 
mus, a river called Sidon, and lands to the 
north and south occupied by the Nephites 
and their enemies, the Lamanites. 

After years of studying maps, Mormon 
scripture, and Spanish chronicles, Ferguson 
had concluded that the Book of Mormon took 
place around the Isthmus of Tehuantepec, the 
narrowest part of Mexico (see map, p. 266). 
He had come to the jungles of Campeche, 
northeast of the isthmus, to find proof. 

As the group’s local guide hacked a path 
through the undergrowth with his machete, 
that proof seemed to materialize before 
Ferguson’s eyes. “We have explored four days 
and have found eight pyramids and many
lesser structures and there are more at ev-
ery turn,” he wrote of the ruins he and his
companions found on the western shore of 
Laguna de Términos. “Hundreds and pos- 
sibly several thousand people must have 
lived here anciently. This site has never been 
explored before.” 

Ferguson, a lawyer by training, did go on 
to open an important new window on Meso- 
america’s past. His quest eventually spurred 
expeditions that transformed Mesoamerican 
archaeology by unearthing traces of the 
region’s earliest complex societies and ex- 
ploring an unstudied area that turned out 
to be a crucial cultural crossroads. Even to- 
day, the institute he founded hums with re- 
search. But proof of Mormon beliefs eluded 
him. His mission led him further and fur- 
ther from his faith, eventually sapping him 
of religious conviction entirely. Ferguson
placed his faith in the hands of science, not
realizing they were the lion’s jaws.

[Photo: Stela 5 from Izapa in Mexico, an early site first extensively excavated by New World Archaeological Foundation archaeologists, shows a mythical tree; some Mormons believe it reflects a prophetic dream from the Book of Mormon.]

But that night, lying in his hammock lis- 
tening to the rain and the occasional roar of 
a jaguar in the distance, Ferguson felt surer 
than ever that Mesoamerican civilizations 
had been founded by migrants from the 
Near East, just as his religion had taught 
him. Now, he thought, how would he con- 
vince the rest of the world? 


THE CHURCH OF JESUS CHRIST of Latter-day 
Saints (LDS) doesn’t take an official position 
on where the events in the Book of Mormon 
occurred. But the faithful have been trying 
to figure it out practically since 1830, when 
church founder Joseph Smith published 
what he said was a divinely inspired account 
of the ancient Americas. Smith said an angel 
had led him to buried ancient golden plates, 
which he dug up and translated into the 
Book of Mormon. Smith’s account of bur- 
ied wonders was one of many in the United 
States at the time. As white settlers moved
west, they encountered mounds filled with 
skeletons and artifacts, including beauti- 
ful pottery and ornaments. Newspapers, 
including those in Smith’s hometown of 
Palmyra, New York, buzzed with specula- 
tion about who the “mound builders” were 
and how they came by their refined culture. 
Many settlers, blinded by racism, concluded 
that the mound builders—now known to be 
indigenous farming societies—were a lost 
people who had been exterminated by the 
violent ancestors of Native Americans. The 
Book of Mormon, with its saga of righteous, 
white Nephites and wicked, dark-skinned 
Lamanites, echoed these ideas. 

The Book of Mormon also spoke of sprawl- 
ing ancient cities, none of which had been 
identified in the United States. So in the 
1840s, Mormons, including Smith him- 
self, took notice of a U.S. explorer’s best- 
selling accounts of visits to the ruins of 
Mayan cities in Mexico and Guate-
mala. In 1842, as editor of a Mor-
mon newspaper, Smith published 
excerpts from a book about the ru- 
ins of the Mayan city of Palenque 
in Mexico, with the commentary: 
“Even the most credulous cannot 
doubt ... these wonderful ruins of 
Palenque are among the mighty 
works of the Nephites—and the 
mystery is solved.” 

But non-Mormons continued 
to doubt, and church authori- 
ties gradually retreated from ex- 
plicit statements about Book of 
Mormon locations. By the 1930s, 
when Ferguson learned about 
Mesoamerican civilizations as an 
undergraduate at the University 
of California (UC), Berkeley, the 
matter had been largely ceded to 
amateurs who pored over maps 
and the Book of Mormon looking 
for correspondences. 

Ferguson wasn’t impressed by 
their efforts. “The interested and 
inquiring mind of the modern 
investigator is not satisfied with 
explanations which are vague, un-
sound, and illogical,” he wrote in
an article in a church magazine 
in 1941. By then he was a law stu- 
dent at UC Berkeley and intrigued 
by the idea of scientifically testing 
Smith’s revelation. In a later letter, 
he wrote, “It is the only Church on 
the face of the earth which can be 
subjected to this kind of investiga- 
tion and checking.” And in another, 
to the LDS leadership, he declared, 
“The Book of Mormon is either fake 
or fact. If fake, the [ancient] cities 
described in it are non-existent. If fact—as 
we know it to be—the cities will be there.” 


TALL AND HANDSOME, with a lawyer’s prac- 
ticed authority, Ferguson trusted that the 
tools of science could persuade the world of 
the truth of the Book of Mormon. Soon after 
he finished college, he began searching for 
clues in colonial documents that recorded 
some of Latin America’s indigenous tradi- 
tions. One, written around 1554 by a group 
of K’iche’ Mayan villagers in the Guatemala
highlands, stated that their ancestors,
“sons of Abraham and Jacob,” had sailed
across a sea to reach their homeland. The
K’iche’ were defeated by Spanish conquis-
tadors in 1524, and the biblical references
were likely the product of contact with 
Catholic priests, who enthusiastically con- 
verted allies and former foes alike. 

But Ferguson, who had grown up in a 
Mormon family in Idaho, eagerly took such 
syncretism as proof that Israelites had once
settled in the Americas. He was also taken
by the myth of Quetzalcóatl, the feathered
serpent deity that some colonial priests de-
scribed as a bearded white man. Ferguson
concluded that he was Jesus, appearing in
Bountiful after his resurrection just as the
Book of Mormon recorded. His library re-
search spurred his first hunt for archaeo-
logical evidence, in Campeche in 1948.

[Map: Ferguson’s holy land. His quest spurred digs in central and coastal Chiapas in Mexico, previously overlooked in favor of Olmec and Mayan lands.]

Ferguson realized, however, that colo- 
nial sources represented circumstantial 
evidence at best. Nor was it enough to 
find ruins of past civilizations in more 
or less the right location, as he had done 
in Campeche. To persuade and convert 
outsiders—a priority for Mormons—he 
sought objects mentioned in the Book of 
Mormon that archaeologists hadn’t found 
in Mesoamerica: horses, wheeled chariots, 
steel swords, and, most important, Hebrew 
or Egyptian script. “The final test of our 
views of Book of Mormon geography will 
be archaeological work in the ground it-
self,” Ferguson wrote in 1951 to his friend
J. Willard Marriott, the wealthy founder of 
the Marriott hospitality chain and a power- 
ful figure in the church. 

Ferguson’s idea that Mesoamerican soci-
eties were seeded by Western ones is widely
recognized as racist today. But it fit right 
into the archaeological thinking of the time, 
when Mesoamerican archaeologists were 
consumed by the question of whether civi- 
lizations had evolved independently in the 
Americas or had roots elsewhere. “In the 
1940s and 1950s, these were the questions 
everyone was investigating,” says Robert
Rosenswig, an archaeologist at the State 
University of New York (SUNY) in Albany. 

Ferguson never received a formal educa- 
tion in archaeology. He practiced law to sup- 
port his growing family—he eventually had 
five children—as well as his research. But in 
1951, he recruited leading archaeologists to 
explore the origin of Mesoamerican civiliza- 
tion as part of a new institution, the New 
World Archaeological Foundation (NWAF). 
First on board was renowned researcher 
Alfred Kidder of Harvard University and the 
Carnegie Institution for Science in Wash- 
ington, D.C. Kidder thought Mesoamerican 
civilizations had developed independently, 
but he and Ferguson had met at a museum 
in Guatemala City in 1946 and struck up 
a correspondence. 

Kidder “is recognized as the best
[Mesoamerican] archaeologist of the
20th century,” says archaeologist John Clark 
of Brigham Young University (BYU) in 
Provo, Utah, who directed NWAF from 1987 
to 2009. To get Kidder on the project, Clark 
says, “There’s no question that Ferguson had 
to be some charismatic guy.” Also recruited
was Gordon Ekholm, an anthropologist at 
the American Museum of Natural History 
in New York City, who thought that Meso- 
american civilizations had their roots in ad- 
vanced Asian cultures. 

Their timing was good. Radiocarbon dat- 
ing had just been invented, and Ferguson im- 
mediately recognized its potential for tracing 
the origins of Mesoamerican cultures. “This 
is the greatest development since the begin-
ning of archaeology,” he wrote to LDS leader-
ship. “I am of the personal opinion that the 
Lord inspired [radiocarbon dating] that it 
might be used effectively in connection with 
the Book of Mormon.” 

Yet the first years of NWAF were a des- 
perate scramble for money. Ferguson con- 
tributed thousands himself and raised 
funds from wealthy Mormons and the 
audiences of his lectures about Book of 
Mormon geography. In 1952, NWAF man- 
aged to send a handful of U.S. and Mexican 
archaeologists to survey the drainage basin 
of the Grijalva River in Tabasco and Chi- 
apas, which Ferguson believed to be the 
Book of Mormon’s River Sidon. 

“The Book of Mormon is either fake or fact. If fake, the [ancient] cities described in it are non-existent. If fact—as we know it to be—the cities will be there.”
Thomas Stuart Ferguson, in a 1958 letter

By this point, Ferguson had become more
discerning about time periods than he had
been in the jungles of Campeche. The ruins 
he found there were likely Classic or post- 
Classic Mayan, from between 250 C.E. and 
the Spanish conquest—much too late to be 
Mesoamerica’s earliest civilization or the 
period mentioned in the Book of Mormon, 
believed to be about 2200 B.C.E. to 400 C.E. 
“We'll never solve pre-Maya origins by dig- 
ging up more Mayas,” Ferguson wrote to 
Kidder in April 1953. They needed Formative 
period sites, dating from about 2000 B.C.E. 
to 200 C.E., roughly matching the dates as- 
sociated with the Book of Mormon. 

In May 1953, Ferguson arrived in Chiapas
to lend a hand. “He was rather alarmed that 
we hadn’t found anything notable, because 
he felt he had to have something pretty 
spectacular to go and get more money for 
another year,” recalls John Sorenson, then
a master’s student in archaeology at BYU 
(and a Mormon). To jump-start the search, 
Ferguson chartered a small plane, and he 
and Sorenson flew over the lush lowlands of 
central Chiapas. Fifteen kilometers southeast 
of the state capital, Tuxtla Gutiérrez, they 
spotted the mounds and plazas of the an- 
cient site of Chiapa de Corzo—which was 
then unknown to archaeologists. Later 
NWAF excavations dated the city to the 
Formative period. 

Back on the ground, Ferguson and 
Sorenson set out by jeep for a 10-day 
survey to see what else they could 
find. “We'd go from site to site, town 
to town, asking ‘Are there any ruins 
around here?’” says Sorenson, who
went on to receive a Ph.D. in anthro- 
pology from UC Los Angeles (UCLA) 
and is now a professor emeritus at BYU. 
Ferguson also asked locals whether they 
had found figurines of horses—unknown 
in ancient Mesoamerica—or sources of iron 
ore, which Sorenson found naive. But his own 
archaeological training paid off, and at some 
sites he was able to identify the polished, 
monochrome pottery and hand-sculpted, 
irregular human figurines of the Forma- 
tive period, so different from the intricate 
but standardized figurines the Classic Maya 
had made from molds. In all, Sorenson and 
Ferguson surveyed 22 sites on that journey 
and collected an astounding number of 
Formative artifacts. “In my humble opinion 
there is little or no question about it—they 
are Nephite making,” Ferguson wrote to his
church funders. 

In 1954, LDS authorities granted NWAF 
$250,000 for 5 years of work. Intensive ex- 
cavations at Chiapa de Corzo uncovered 
stone pyramids and tombs, and a wealth 
of pottery that impressed University of 
Pennsylvania anthropologist John Alden 
Mason, then working with NWAF. “Since pre- 
Classic pottery is not very common any- 
where, and that of this region is entirely new, 
it is of course a very great scientific contri- 
bution,” Mason wrote to Ferguson. Eventu- 
ally, archaeologists reported that the site was 
settled around 1200 B.C.E., likely by people 
connected to the Olmec, an early civiliza- 
tion that dominated the gulf coast of Mexico 
from 1200 B.C.E. to 400 B.C.E., centuries be- 
fore the Classic Maya arose. 

Then, in the early 1960s, NWAF archaeo- 
logists became the first to extensively exca- 
vate at Izapa, near the Chiapas coast and 
the Guatemalan border. They were drawn to 
the site in part because of a monument that
apparently depicts a myth involving a tree; 
Ferguson’s friend and founder of BYU’s 
archaeology department, M. Wells Jakeman, 
argued that the carving shows visions re- 
ceived in a dream by the Mormon prophet 
Lehi. NWAF archaeologists, some of whom 
were Mormon, later soundly rebuffed that 


A ritual figurine from the site of
Los Horcones is scanned at New World 
Archaeological Foundation headquarters. 


interpretation. But Izapa turned out to be a 
key site in the Soconusco, the Pacific coast 
region from which every Mesoamerican po- 
litical power, from the Olmec in 1200 B.C.E. 
to the Aztec empire in the early 1500s C.E., 
sourced key luxury goods such as cacao and 
quetzal feathers. NWAF spearheaded excava- 
tions throughout this region. Pottery finds 
and dates from Izapa and elsewhere formed 
the basis of the ceramic chronologies for the 
Formative period that are still used by every 
archaeologist working in central and coastal 
Chiapas today. 

“They were working in a part of Mesoamerica
that was really unknown,” says
Michael Coe, an influential Mesoamerican 
archaeologist and professor emeritus at 
Yale University who, at the time, was sur- 
veying Formative sites just over the border 
in Guatemala. “NWAF put it on the map.” 

But even as NWAF grew in scientific stat- 
ure, and was finally assured continued exis- 
tence when BYU took it over in 1961, Ferguson 
was quietly becoming frustrated. The smok- 
ing gun he had been certain he would find— 
Egyptian or Hebrew script—proved elusive. 
He once had promised that archaeological 
evidence for the Book of Mormon would be 
found within 10 years of NWAF starting excavations.
But in 1966 he wrote, “My number
one goal of establishing that Christ appeared 
in Mexico following the crucifixion will never 
be achieved until significant ancient manu- 
script discoveries are made. I hope it happens 
during our lifetimes.” 

When an ancient manuscript discov- 
ery did come, however, it was from a dif- 
ferent quarter of the world—and it shook 
Ferguson’s faith to its core. 


IN THE SUMMER OF 1835, Joseph Smith had 
received a curious visitor in Kirtland, 
Ohio, then the headquarters of his 
burgeoning LDS church: a traveling 
showman, with four Egyptian mum- 
mies and some hieroglyphic texts in 
tow. The church bought the mum- 
mies and texts, and Smith said he 
translated the hieroglyphics, result- 
ing in the Book of Abraham, which 
lays out Smith’s cosmic vision of 
the afterlife. (Although Egyptian hi- 
eroglyphics had been deciphered in 
France in 1822 with the help of the Ro- 
setta Stone, the news had barely reached 
U.S. shores.) As Smith and his followers
moved around the Midwest, often fleeing 
angry mobs, they carried the mummies and 
papyri with them. After Smith’s death at the 
hands of one of those mobs in Nauvoo,
Illinois, they were sold by his family.

The fate of the mummies remains a mys- 
tery. But in 1966, a University of Utah profes- 
sor examining artifacts at the Metropolitan 
Museum of Art in New York City came across 
11 Egyptian papyri with an 1856 certificate
of sale signed by Smith’s widow, Emma. The 
professor realized he was looking at the Book 
of Abraham papyri, and the documents were 
returned to the Mormon church. 

Ferguson learned the news from a front- 
page article in the newspaper Deseret News 
on 27 November 1967. Within days, he wrote 
to a friend in the church leadership, begging 
to know whether the papyri would be stud- 
ied. Hearing that no studies were planned, 
Ferguson, as ever, took matters into his own 
hands. He received photos of the documents 
from the church and hired Egyptologists at 
UC Berkeley to translate them. He told the 
scholars nothing about the religious signifi- 
cance of the papyri. “He was conducting a 
clearly blind test,” Clark says. 

The results started coming in 6 weeks 
later. “I believe that all of these are spells 
from the Egyptian Book of the Dead,” 
UC Berkeley Egyptologist Leonard Lesko 
wrote to Ferguson. Three other scholars 
independently gave Ferguson the same 
result: The texts were authentic ancient 
Egyptian, but represented one of the most 
common documents in that culture. 
After decades of stressing the importance
of the scientific method and using it to 
shore up his own faith, Ferguson now found 
himself at its mercy. “I must conclude that 
Joseph Smith had not the remotest skill in 
things Egyptian-hieroglyphics,” he wrote to
a fellow doubting Mormon in 1971. What’s 
more, he wrote to another, “Right now I am 
inclined to think that all of those who claim 
to be ‘prophets,’ including Moses, were without
a means of communication with deity.”

This doubt ultimately spread to
Ferguson’s archaeological quest. In 1975, 
he submitted a paper to a sym- 
posium about Book of Mormon 
geography outlining the failure of 
archaeologists to find Old World 
plants, animals, metals, and scripts 
in Mesoamerica. “The real implication
of the paper,” he wrote in
a letter the following year, “is that 
you can’t set Book of Mormon 
geography down  anywhere— 
because it is fictional.” 

Although open about his doubts 
in his private letters, Ferguson didn’t 
discuss his loss of faith with his fam- 
ily. He continued attending church, 
singing in the choir, and even giving bless- 
ings. “[Mormons] are so immersed in that 
culture ... [that] to lose your faith, it’s like 
you're being expelled from Eden,” Coe says. 
“I felt sorry for him.”

Ferguson continued to visit Mexico and 
from time to time stopped by NWAF head- 


“I am of the personal opinion
that the Lord inspired 
[radiocarbon dating] that 
it might be used effectively 
in connection with 
the Book of Mormon.” 


Thomas Stuart Ferguson, 
in a 1956 letter


quarters in Chiapas, where he spoke frankly 
with Clark in 1983. “He resented that he 
spent so much time trying to prove the 
Book of Mormon. He said it was a fraud,” 
remembers Clark, who is Mormon. The 
next month, Ferguson died of a heart attack 
while playing tennis. He was 67. 


ON A RECENT AFTERNOON AT NWAF headquar- 
ters here, scholars wander among buildings, 
sheltered patios, and a courtyard brim- 
ming with flowers and citrus trees. UCLA 
archaeologist Richard Lesure sorts through 
ceramics he excavated 27 years ago at Paso 
de la Amada on the Chiapas coast, home to
Mesoamerica’s first known ball court and 
elite residences. With NWAF support, Lesure 
has spent nearly 3 decades studying why 
mobile, egalitarian hunter-gatherers settled 
down here and created the oldest complex 
society in Mesoamerica around 1900 B.C.E., 
before even the Olmec rose to power. 


At the New World Archaeological Foundation, 
Richard Lesure studies artifacts from 
Mesoamerica’s earliest complex society. 


Upstairs, Claudia Garcia-Des Lauriers, an 
archaeologist at California State Polytech- 
nic University in Pomona, watches as an 
undergraduate student carefully positions 
an opossum-shaped ceramic whistle in the 
thin red laser beams of a 3D scanner. The 
researchers are creating a digital version of 
the ritual object, which Garcia-Des Lauriers 
discovered at the Classic period site of Los 
Horcones on the Chiapas coast. Meanwhile, 
in the backyard, Clark leads an impromptu 
flint knapping lesson, using obsidian nod- 
ules strewn about the lawn. 

“It’s such a stimulating place to work,” 
says Janine Gasco, an archaeologist at Cali- 
fornia State University in Dominguez Hills, 
who began working with NWAF in 1978. 
“It’s been a force in my life.” 

In the years after Ferguson drifted away 
from the church and the foundation, NWAF 
continued to lead excavations, fund gradu- 
ate students, publish an impressive amount 
of raw data, and store archaeological col- 
lections. Thanks to its work, a region that 
once seemed an archaeological backwater 
compared with the nearby Classic Mayan 
heartland in the Yucatan, Guatemala, and 
Belize has been revealed as the birthplace of 
Mesoamerican civilization and an economic
and cultural hot spot, where people from all 
over the region crossed paths. “We wouldn’t 
know anything about [central and coastal] 
Chiapas if it wasn’t for [NWAF],” Garcia-Des 
Lauriers says. 

“Their work set the stage for everything 
I’ve done,” says SUNY Albany’s Rosenswig, 
who led recent excavations at Izapa to 
study the origins of urban life in Meso- 
america. When his graduate student Rebecca 
Mendelsohn, now a postdoc at the Smith- 
sonian Tropical Research Institute in 
Panama City, excavated in Izapa in 2014, 
NWAF’s original map of its mounds
and monuments served as a vital
field reference (Science, 16 May 2014,
p. 684). “I’ve been surprised at how
sound the work from the 1960s still
is,” she says.

NWAF is still run by BYU, which
means its funding comes from the
Mormon church and all its directors
have been Mormons. But aside from
a ban on coffee at headquarters, the
archaeologists who work here barely
notice its religious roots. “There aren’t
conversations about religion,” Gasco says.
“The archaeological community has a lot of 
respect for the work done here.” 

Ferguson had hoped the Chiapas coast 
would turn out to be a crossroads not just 
for Mesoamerica, but the world. But the 
more NWAF and its collaborators excavated 
and analyzed sites in the region, the more 
they confirmed that Mesoamerican civiliza- 
tion sprang up from entirely New World or- 
igins. For archaeologists today, that makes 
the field all the more exciting. “That’s one 
of the most amazing things about study- 
ing Mesoamerican archaeology—it’s one 
of a half-dozen or so cases of independent 
development of agriculture, development 
of complexity, development of cities,”
Rosenswig says. 

It is hard to know whether Ferguson 
would have shared that excitement. For all 
his trust in science, his goal was to serve his 
faith. Some believing Mormons still read his 
books and trust his early, enthusiastic ideas 
about Mesoamerica. Others who came to 
doubt their religion also found hope in his 
story. His loss of faith gave them conviction 
and strength as they began their own journey 
down a difficult road, as shown by many who 
wrote him anguished letters in his later years. 

But it is his scientific legacy, long unrecognized,
that is perhaps most significant. “Facts are
facts and truth is truth,” Ferguson once wrote about the
archaeological evidence for the Book of 
Mormon that he was sure was about to be 
discovered in southern Mexico. His belief in 
that principle never wavered.
Assessing nature's contributions to people


Recognizing culture, and diverse sources of knowledge, can improve assessments 


By Sandra Díaz, Unai Pascual, Marie Stenseke, Berta Martín-López, Robert T. Watson, Zsolt Molnár, Rosemary Hill, Kai M. A. Chan,
Sebsebe Demissew, Gunay Erpul, Pierre Failler, Carlos A. Guerra, Chad L. Hewitt, Hans Keune, Sarah Lindley, Yoshihisa Shirayama


A major challenge today and into the future
is to maintain or enhance beneficial
contributions of nature to a good
quality of life for all people. This is 
among the key motivations of the In- 
tergovernmental Science-Policy Plat- 
form on Biodiversity and Ecosystem Services 
(IPBES), a joint global effort by governments, 
academia, and civil society to assess and pro- 
mote knowledge of Earth’s biodiversity and 
ecosystems and their contribution to human 
societies in order to inform policy formula- 
tion. One of the more recent key elements of 
the IPBES conceptual framework (1) is the
notion of nature’s contributions to people 
(NCP), which builds on the ecosystem ser- 
vice concept popularized by the Millennium 
Ecosystem Assessment (MA) (2). But as we
detail below, NCP as defined and put into 
practice in IPBES differs from earlier work 
in several important ways. First, the NCP ap- 
proach recognizes the central and pervasive 
role that culture plays in defining all links be- 
tween people and nature. Second, use of NCP 
elevates, emphasizes, and operationalizes the 
role of indigenous and local knowledge in un- 
derstanding nature’s contribution to people. 
The broad remit of IPBES requires it to 
engage a wide range of stakeholders, span- 
ning from natural, social, humanistic, and 
engineering sciences to indigenous peoples 
and local communities in whose territories
lies much of the world’s biodiversity. Because
IPBES is an intergovernmental body, such inclusiveness
is essential not only for advancing knowledge
but also for the political legitimacy of assessment
findings (3).


FROM SERVICES TO CONTRIBUTIONS 

NCP are all the contributions, both positive 
and negative, of living nature (diversity of 
organisms, ecosystems, and their associated 
ecological and evolutionary processes) to 
people’s quality of life (4). Beneficial contri- 
butions include, for example, food provision, 
water purification, and artistic inspiration, 
whereas detrimental contributions include 
disease transmission and predation that 
damage people or their assets. Many NCP 
may be perceived as benefits or detriments 
depending on the cultural, socioeconomic, 
Nature in the form of a living root bridge
in Meghalaya, India, contributes to people
by connecting both sides of the river.


temporal, or spatial context. For example, 
some carnivores are recognized—even by the 
same people—as beneficial for control of wild 
ungulates but as harmful because they may 
attack livestock. 

At first inspection, the notion of NCP does 
not appear to differ much from the original 
MA definition of ecosystem services (2), 
which was broad and contemplated links 
to many facets of well-being. However, the 
detailed conceptualization and the practical 
work on ecosystem services following on the 
MA were dominated by knowledge from the 
natural sciences and economics. The natu- 
ral sciences, and ecology in particular, were 
used to define “ecological production func- 
tions” to determine the supply of services, 
conceptualized as flows stemming from 
ecosystems (stocks of natural capital) (5). 
Economics was used to estimate the mone- 
tary value of those ecosystem services flows 
so as to identify trade-offs among them and 
their impacts on well-being. Aided by ecology
and economics having readily available
tools, the ecosystem services approach developed
into a vibrant research field, influenced
policy discourse, and advanced the
sustainability agenda. 

However, this predominantly stock-and- 
flow framing of people-nature relationships 
largely failed to engage a range of perspec- 
tives from the social sciences (6), or those 
of local practitioners, including indigenous 
peoples. This reinforced a mutual alienation 
process in which MA-inspired studies and 
policies became increasingly narrow, which 
in turn led to voluntary self-exclusion of dis- 
ciplines, stakeholders, and worldviews. As a 
consequence, the ecosystem services research 
program proceeded largely without benefit- 
ing from insights and tools in social sciences 
and humanities. For example, the unpacking 
and valuation of some “cultural ecosystem 
services” not readily amenable to biophysical 
or monetary metrics have lagged behind (7), 
and so has their mainstreaming into policy. 
In addition, as diverse disciplines and stake- 
holders remained at the margins, the initial 
skepticism toward the ecosystem services 
framework turned into active opposition, of- 
ten based on the perceived risks of commodi- 
fication of nature (8) and associated social 
equity concerns (9). 

The need to be inclusive, both in terms of 
the strands of knowledge incorporated and 
representation of worldviews, interests and 
values (10), required IPBES to move to using 
NCP. Although still rooted in the MA ecosys- 
tem services framework (fig. S1), this new ap- 
proach has the potential to firmly embed and 
welcome a wider set of viewpoints and stake- 
holders. It should also be less likely to be 
subsumed within a narrow economic (such 
as market-based) approach as the mediating 
factor between people and nature. 


AN INCLUSIVE SYSTEM 
The NCP approach explicitly recognizes 
that a range of views exist. At one extreme, 
humans and nature are viewed as distinct 
(2); at the other, humans and nonhuman 
entities are interwoven in deep relation- 
ships of kinship and reciprocal obligations 
(11, 12). In addition, the way NCP are copro- 
duced by nature and people is understood 
through different cultural lenses. For in- 
stance, coproduction of food in high-diver- 
sity agriculture can be framed as a process 
that combines a set of biological and tech- 
nological inputs aimed at maximizing coex- 
istence between useful plants and animals 
in order to achieve higher yields. 
Alternatively, coproduction of food can be 
seen as a “practice of care” (12, 13) through 
social relationships and connection with 
spiritual entities. Therefore, we propose two 
lenses through which to view NCP: a gen- 
eralizing perspective and a context-specific 
perspective. Although presented here as 
extremes, these two perspectives are often 
blended and interwoven (14), enabling co-
construction of knowledge among disciplines
and knowledge systems (fig. S2). 


Generalizing perspective 

Typical of the natural sciences and econom- 
ics, this perspective (represented in green 
at the bottom of fig. S2) is fundamentally 
analytical in purpose; it seeks a universally 
applicable set of categories of flows from 
nature to people. Distinction between them 
is often sharp, and agency is acknowledged 
only in the case of people. NCP categories 
can be seen at finer or coarser resolution 
but can still be organized into a single, self- 
consistent system. 

We identify 18 such categories for report- 
ing NCP within the generalizing perspec- 
tive, organized in three partially overlapping 
groups: regulating, material, and nonmate- 
rial NCP (fig. S3 and table S1), defined ac- 
cording to the type of contribution they make 
to people’s quality of life. 

Material contributions are substances, ob- 
jects, or other material elements from nature 
that directly sustain people’s physical exis- 
tence and material assets. They are typically 
physically consumed in the process of being 
experienced—for example, when organisms 
are transformed into food, energy, or materi- 
als for ornamental purposes. 

Nonmaterial contributions are nature's ef- 
fects on subjective or psychological aspects 
underpinning people’s quality of life, both in- 
dividually and collectively. Examples include 
forests and coral reefs providing opportuni- 
ties for recreation and inspiration, or par- 
ticular animals and plants being the basis of 
spiritual or social-cohesion experiences. 

Regulating contributions are functional 
and structural aspects of organisms and eco- 
systems that modify environmental condi- 
tions experienced by people and/or regulate 
the generation of material and nonmaterial 
contributions. Regulating contributions fre- 
quently affect quality of life in indirect ways. 
For example, people directly enjoy useful or 
beautiful plants but only indirectly benefit 
from the soil organisms that are essential for 
the supply of nutrients to such plants. 

Culture permeates through and across all 
three broad NCP groups (fig. S1) rather than 
being confined to an isolated category (the 
“cultural ecosystem services” category in the 
MA framework). In addition, the three broad 
groups—rather than being independent 
compartments, as typically framed within 
the ecosystem services approach—explicitly 
overlap. We distinguish them for practical 
reporting reasons, acknowledging that many 
of the 18 NCP categories do not fit squarely 
into a single group (fig. S3). For example, 
food is primarily a material NCP because 
calories and nutrients are essential for physical
sustenance. However, food is full of symbolic
meaning well beyond physical survival.
Indeed, nonmaterial and material contribu- 
tions are often interlinked in most, if not all, 
cultural contexts (7). 


Context-specific perspective 

This is the perspective typical, but not ex- 
clusive, of local and indigenous knowledge 
systems (represented in blue at the top of 
fig. S2). In local and indigenous knowledge
systems, the production of knowledge typi- 
cally does not explicitly seek to extend or vali- 
date itself beyond specific geographical and 
cultural contexts (14). Indeed, the context-
specific perspective on NCP often tends to
resist the scientific goal of attaining a univer- 
sally applicable schema. 

Although subdivision into internally con- 
sistent systems of categories is common in 
many local knowledge systems, a universally 
applicable classification—such as the one 
proposed in the generalizing perspective on 
NCP (table S1)—is not currently available and 
may be inappropriate because of cultural in- 
commensurability and resistance to univer- 
sal perspectives on human-nature relations. 
The context-specific perspective may instead 
present NCP as bundles that follow from dis- 
tinct lived experiences such as fishing, farm- 
ing, or hunting or from places, organisms, or 
entities of key spiritual significance, such as 
sacred trees, animals, or landscapes (11, 13).

Providing space for context-specific per- 
spectives recognizes that there are multiple 
ways of understanding and categorizing re- 
lationships between people and nature and 
avoids leaving these perspectives out of the 
picture or forcing them into the 18 general- 
izing NCP categories. The NCP approach 
thus facilitates respectful cooperation across 
knowledge systems in the co-construction of 
knowledge for sustainability. 


NURTURING A PARADIGM SHIFT 

The NCP concept extends beyond the 
highly influential yet often contested no- 
tion of ecosystem services, incorporating 
a number of interdisciplinary insights and 
tools. Most of them were called for during 
the past decade (9, 10, 12, 14) but only now 
are enshrined explicitly in an environmen- 
tal assessment framework. 

The implementation of the NCP approach 
and its reporting categories (tables S1 and S2) 
is still in its infancy and is expected to be fully 
fledged only in the IPBES Global Assessment, 
but the NCP approach is already changing as- 
sessment procedures and their outcomes. For 
example, the ongoing IPBES regional assess- 
ments include an unprecedented effort to tap 
indigenous and local knowledge, from the 
literature and also from dialogues with indig- 
enous and local knowledge-holders, to which 
they contributed information presented in
their own narratives. In the Europe and Central
Asia assessment, these narratives (15)
revealed complex interactions between detri- 
mental (predation on livestock) and benefi- 
cial NCP (carcass removal or protection by 
shepherd/guard dogs) that were not consid- 
ered in previous national ecosystem assess- 
ments. This kind of evidence also enhanced 
the confidence about the status and trends 
of other NCP in cases in which the evidence 
based on published literature was scarce 
(such as for NCP “Supporting identities”). 
In this regional assessment, it was relatively 
easy to fit most narratives into the 18 catego- 
ries of the generalizing perspective on NCP. 
In assessing pollinators, pollination, and 
food production (16), the dialogue with
local and indigenous knowledge-holders 
highlighted some NCP that were defined 
as practices of care gifted to people, such 
as fostering pollinator nesting resources 
in forests, totemic relationships requiring 
reciprocal obligations between people and 


“The NCP approach aims at 
... products that are ... more 
likely to be incorporated into 
policy and practice.” 


pollinators, and traditional governance 
that depends on ongoing presence of bees 
and butterflies in the landscape (table S2) 
(13). These context-specific NCP do not fit 
easily in the 18 generalizing NCP categories. 
Nevertheless, these knowledge sources un- 
derpinned innovative strategic responses 
highlighted in the main messages to pol- 
icy-makers that were agreed on among all 
the member countries of IPBES (16): to
strengthen traditional governance and ten- 
ure systems that support pollinators, which 
are critical in many places where these 
systems are being eroded through rapid 
industrialization. 

These examples illustrate how the inter- 
weaving of epistemologically diverse lines 
of evidence (14) about specific subjects can
result in richer solutions for people and na- 
ture, even within the context of large-scale 
assessments. But regardless of the outcomes 
of the assessments, the consideration of dif- 
ferent knowledge systems—and the fact that 
generalizing, context-specific, and mixed 
perspectives are considered as equally use- 
ful—matters in terms of making IPBES pro- 
cedures and outcomes more equitable. This 
should help overcome existing power asym- 
metries between western science and in- 
digenous and local knowledge, and among 
different disciplines within western science,
in the science-policy interface. The NCP ap- 
proach aims at coming up with products 
that are better and also more legitimate and 
therefore more likely to be incorporated into 
policy and practice. 

In addition to assessments, environ- 
mental governance and associated policies 
would likely increase their effectiveness 
and social legitimacy by drawing on the 
NCP approach. This is because it facilitates,
much more than previous framings did, the
connection with rights-based approaches
to conservation and sustainable use of na- 
ture and their implications for quality of 
life. The presence of multiple worldviews 
and diverse ways of expressing them in the 
wording of the Convention on Biological 
Diversity’s strategic plan for biodiversity 
and specific objectives, such as the Aichi 
Targets, further illustrates how important 
inclusive framings are to the broad political 
legitimacy of these international objectives 
and their implementation instruments. 
PERSPECTIVES


CHEMICAL ENGINEERING 


The two 3D-printed cylindrical 
static mixers (diameter of 6 mm 
and length of 150 mm) were 
coated with metal catalysts for 
hydrogenation reactions (10). 


The art of manufacturing molecules 


Additively manufactured monolithic reactors allow on-demand synthesis of drug molecules 


By Christian H. Hornung 


The way we manufacture many of the
products used in everyday life, such 
as the ingredients in shampoo, the 
plastic components of smartphones, 
the vitamins and pharmaceuticals 
we take, and the packaging that all of 
them come in, has not changed in a signifi- 
cant way over the last hundred years. Argu- 
ably, these methods of manufacturing are 
even older and were already applied in the 
first large-scale chemical processes in the 
19th century, in which new products such as 
vulcanized rubber, synthetic dyes, or indus- 
trial fertilizers were first produced on scales 
unknown to society at the time. The development
of these industrial processes was
driven by the benefits of economy of scale, 
with the aim of centralizing, optimizing, 
maximizing, and integrating production. In 
recent years, efforts were made by a series 
of research groups to reverse this trend and 
decentralize, miniaturize, and even digitize 
chemical manufacturing. On page 314 of 
this issue, Kitson et al. (1) report the syn- 
thesis of active pharmaceutical ingredients 
(APIs) on demand in a three-dimensional 
(3D)-printed, miniaturized reactor cascade. 
A complete multistep synthesis of the mus- 
cle relaxant baclofen was developed and dig- 
itized for remote bench-scale manufacture. 
Chemical production, whether for pet- 
rochemicals or pharmaceuticals, requires 
specialized equipment and facilities, and its 
hazardous nature demands highly trained
operators. Centralizing a series of chemical 
operations within one large dedicated plant 
is generally a much more sensible approach 
than dividing them up between small, dis- 
tributed manufacturers or single-person 
craftsmanship. However, centralized pro- 
duction also has its downsides, such as the 
distance to the point of use and the associ- 
ated transport and storage issues, which are 
a concern for high-value products with a limited
shelf life (e.g., many pharmaceuticals).
Recent technological advances are allowing 
us to challenge this old way of thinking, and 
to propose new answers to the question of 
how and where the molecules and materi- 
als we use day in, day out should be made. 
Miniaturization and additive manufacture 
are two key elements of these efforts to decentralize
chemical production, and Kitson
et al.’s innovative method of synthesizing 
APIs in a 3D-printed, miniaturized reactor 
incorporates both of these concepts. 

The miniaturization of chemical synthe- 
sis and analysis, and of the apparatus and 
devices used, has changed laboratory meth- 
odologies during the past few decades. Mini- 
and microfabrication techniques introduced 
in the 1980s and 1990s have led to the de- 
velopment of new plate- and chip-type con- 
tinuous-flow microreactors that can better 
control heat and mass transfer, and in turn 
improve conversion, efficiency, or safety of 
a chemical reaction by orders of magnitude 
compared with conventional reactor geome- 
tries (2-4). Microreactor (or flow-chemistry) 
technology makes ultrafast, highly exother- 
mic reactions controllable and practicable 
on laboratory and industrial scales. Device 
miniaturization has also changed the way 
chemicals are analyzed. Lab-on-a-chip de- 
vices made from glass, plastic, or even paper 
can simplify, accelerate, and reduce the cost 
of clinical diagnostics and remote chemical 
analysis (5-7). 

Additive manufacturing techniques, such 
as 3D printing of polymers, ceramics, or met- 
als, overcome limitations of conventional 
subtractive manufacturing methods, result- 
ing in nearly complete freedom of design. 
Recently, chemical engineers, materials sci- 
entists, and others have utilized this poten- 
tial of 3D printing for building more efficient 
chemical reactors with geometries that are 
otherwise not accessible. For example, the 
use of 3D-printed metal structures as more 
efficient and tailor-made mixers, catalysts, or 
both in the synthesis of organic compounds 
is currently being investigated by several 
research groups (see the photo) (8-10). The 
ability to design the reactor geometry specifi- 
cally for a given fluidic or chemical applica- 
tion, and the ability to rapidly prototype the 
device, can give 3D-printed “reactorware” a 
pivotal advantage over traditional methods. 

Kitson et al. developed a blueprint that 
digitizes the bench-scale synthesis of APIs 
into a sequence of batch operations con- 
ducted in a monolithic polymer reactor. 
The originality of this approach is twofold. 
First, it combines several processing steps, 
including four reactions, two liquid-liquid 
extractions, and a set of evaporations and 
filtrations inside one tailor-made reactor 
device containing several interconnected 
modules. Second, said device can be printed 
on demand for use in a distributed setting. 
All that is needed is access to a 3D printer, 
a library of common chemical starting ma- 
terials, and a set of instructions (i.e., design 
files and chemical synthesis protocols), 
and in principle, it would be possible to 
synthesize small amounts of APIs or other
complex, high-value compounds with a limited
shelf life anywhere. Such an approach
is strongly aligned with current efforts by 
other researchers working on distributed 
and remote manufacturing of chemicals and 
pharmaceuticals (11), which usually have 
been executed in larger and more intensi- 
fied processing equipment. The concept of 
designing the operations such that they can 
be housed inside mobile shipping contain- 
ers has been adopted by several industrial 
research and development groups in recent 
years, such as the integrated chemical plants 
developed by the F³ Factory consortium (12).

By demonstrating the multistep synthesis
of baclofen in this integrated, benchtop
device, Kitson et al. have opened the door to
making complex molecules, such as APIs, on
demand in nontraditional manufacturing 
environments such as hospitals or even doc- 
tors’ offices, bringing manufacturing closer 
to the point of use. These manufacturing 
scenarios might also include remote set- 
tings, synthesis of personalized medicines, 
small-scale production of abandoned phar- 
maceuticals, or even space missions. 

For this technology to be put into practice, 
a range of regulatory hurdles would have to 
be considered. How would quality control 
and chemical analysis be approached in 
remote settings? Furthermore, how do the 
costs of remote on-demand synthesis com- 
pare with shipping, storage, and inventory 
of traditionally manufactured drugs? None- 
theless, new technological solutions for the 
manufacture of chemicals and pharmaceu- 
ticals in a decentralized setting are needed, 
and these may have economic, environ- 
mental, and societal benefits, particularly 
for rural communities and industries (11).
Distributed manufacturing is a promising 
approach for sustainable and socially re- 
sponsible production of goods close to their 
point of use, and Kitson et al. are among 
the pioneers bringing us closer to making 
flexible and potentially movable, portable, 
or even printable minifactories a reality.

QUANTUM FLUIDS


Quantum liquids get thin


A mix of two bosonic particles develops attractive forces to create a quantum liquid


By Igor Ferrier-Barbut and Tilman Pfau 


A liquid exists when interactions that
attract its constituent particles to 
each other are counterbalanced by 
a repulsion acting at higher densi- 
ties. Other characteristics of liquids 
are short-range correlations and the 
existence of surface tension (1). Ultracold
atom experiments provide a privileged 
platform with which to observe exotic 
states of matter, but the densities are far 
too low to obtain a conventional liquid be- 
cause the atoms are too far apart to cre- 
ate repulsive forces arising from the Pauli 
exclusion principle of the atoms’ internal 
electrons. The observation of quantum liquid
droplets in an ultracold mixture of two
quantum fluids is now reported on page
301 of this issue by Cabrera et al. (2) and 
a recent preprint by Semeghini et al. (3).
Unlike conventional liquids, these liquids 
arise from a weak attraction and repulsive 
many-body correlations in the mixtures. 
In ordinary liquids, the attraction between
the constituents emerges from weak
forces such as hydrogen bonds or van der
Waals interactions, and the repulsion at
higher density stems from the Pauli exclusion
principle for electrons. The ultracold-atom
samples that were studied were
at about 100 nK, well below the freezing
points of ordinary liquids, but had densities
about eight orders of magnitude lower
than that of water. In this regime, atoms
behave as coherent matter waves, interfering
with each other, and in the case of
so-called bosonic atoms, they form a
Bose-Einstein condensate (BEC).

Mixing up a quantum liquid
Cabrera et al. and Semeghini et al. studied mixtures of two species of trapped, ultracold bosonic potassium
atoms and show that they formed a quantum liquid. Single species have only repulsive interactions and form
quantum gases, but the interaction between species is attractive and allows a liquid to form.

Repulsion versus attraction
The mean-field (MF) approximation predicts that energy (E) varies as n², where n is density. Beyond the mean- [label truncated]

For a single atomic species, the ensemble MF energy is positive, and BMF corrections are weak. A gas forms
that expands in free space to minimize its energy.

Quantum liquid
When two types of atoms are mixed, MF effects nearly cancel out, creating a weak attraction that is
counterbalanced by BMF corrections. A liquid forms at a particular density n₀ that minimizes energy.

Because BECs are very dilute, they exist 
mostly in the weakly interacting regime 
where the mean-field (MF) approximation 
is sufficient to calculate the energy of the 
condensate accurately. This approximation 
assumes that fluctuations are not relevant 
in the system. By properly taking into ac- 
count many-body effects stemming from 
quantum fluctuations, corrections to the 
energy beyond the mean-field approxima- 
tion (BMF) can be obtained. These correc- 
tions create a shift in energy that grows 
with interaction strength but is typically 
very small compared with the MF energy. 

However, this BMF energy shift exhibits
a stronger dependence on density than
the MF energy. In a large and stable single-species
BEC, the MF energy is already
positive so that both effects are repulsive
(see the figure). To obtain a balance between
attraction and repulsion, Petrov
(4) suggested mixing two species of atoms 
together. Within each species, the atoms 
repel each other and form a BEC, but the 
interspecies interaction is attractive and 
slightly stronger. For these conditions, 
the MF energy is very small and negative 
and acts effectively as an attractive force, 
whereas the many-body BMF correction 
remains repulsive and sizable. The system 
then forms a self-bound liquid. 
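
The balance described above can be made concrete with a toy calculation. The sketch below assumes the standard scalings for a dilute Bose mixture, which the text does not spell out: the MF energy density goes as n² and the BMF (Lee-Huang-Yang) correction as n^(5/2), so the energy per particle is -alpha*n + beta*n^(3/2). The coefficients alpha and beta are hypothetical, dimensionless stand-ins for the real interaction parameters, not values from either experiment.

```python
def energy_per_particle(n: float, alpha: float, beta: float) -> float:
    """Toy energy per particle: attractive MF term plus repulsive BMF term."""
    return -alpha * n + beta * n ** 1.5


def equilibrium_density(alpha: float, beta: float) -> float:
    """Density n0 at which d(E/N)/dn = -alpha + 1.5 * beta * sqrt(n) vanishes."""
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive in this sketch")
    return (2.0 * alpha / (3.0 * beta)) ** 2


alpha, beta = 1.0, 1.0  # hypothetical, dimensionless coefficients
n0 = equilibrium_density(alpha, beta)
e0 = energy_per_particle(n0, alpha, beta)

# Brute-force check on a grid: the analytic n0 should sit at the grid minimum.
grid = [i * 1e-4 for i in range(1, 20001)]
n_grid = min(grid, key=lambda n: energy_per_particle(n, alpha, beta))

print(f"n0 = {n0:.4f}, E/N at n0 = {e0:.4f}, grid minimum near n = {n_grid:.4f}")
```

In this toy model the minimum lies at a finite density with negative energy per particle, which is the signature of a self-bound state: the system neither collapses nor expands. With a repulsive MF term (a negative alpha in this notation) the energy would be lowest at n = 0, the gas-like behavior of a single species.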

The experiments of Cabrera et al. and 
Semeghini et al. trap potassium atoms of 
the bosonic isotope ³⁹K. They prepared the
atoms in two different states with the in- 
teratomic interactions tuned so that this 
liquid state would form. To keep the sam- 
ple from forming a gas, the two teams used 
specially designed optical traps to hold the 
droplet against gravity while allowing it to 
expel atoms in the horizontal plane. They 
show that in this configuration, the droplet 
does not expand like a gas would do but 
stays self-bound and behaves like a liquid. 
One defining particularity of this quantum 
liquid is that below a certain atom number,
it evaporates into a gas because quantum
kinetic energy overcomes interactions. 
Both teams explored the resulting liquid- 
gas phase diagram by varying atom num- 
ber and interactions. 

The original theoretical prediction of 
quantum droplets was formulated for BEC 
mixtures, so the observation by Cabrera et 
al. and Semeghini et al. marks an impor- 
tant benchmark. However, quantum drop- 
lets also occur in other Bose-condensed 
systems in which two types of interactions 
exist. In particular, they have been ob- 
served in BECs where magnetic interac- 
tions occur on top of the usual zero-range 
interactions, such as the ones taking place 
in potassium (5-7). These interactions are
repulsive, whereas the dipolar interaction 
can be made attractive, thus fulfilling the 
conditions for the liquid state. These mag- 
netic droplets were anisotropic and longer- 
lived. This diversity of self-bound liquids 
opens possibilities that so far were out of 
reach with BECs. As Cabrera et al. show, 
no theory yet describes the precise proper- 
ties of quantum droplets. Thus, these stud- 
ies will serve as benchmarks for quantum 
many-body theories. 

A question to now address is whether 
these droplets undergo evaporation and 
cool to nearly 0 K, as suggested theoretically
in (4). This very peculiar property
comes from the fact that the self-bound 
liquid should be the only bound solu- 
tion so that any higher-energy excitation 
should be unbound and evaporated. Fur- 
thermore, constraining the liquids in lower 
dimensions should lead to new physics. It 
is indeed a known result from quantum 
mechanics that in three dimensions, an 
attractive potential cannot always bind a 
particle, but in two and one dimensions, 
it always can. Thus, the gas-liquid phase 
diagram would be modified, also because 
fluctuations play a stronger role (8, 9). 
Future studies can also be inspired by the 
rich physics of helium nanodroplets (10). 
For example, quantum droplets could act 
as solvents to trap and study impurity atoms
one by one.

MICROBIOLOGY


A bacterial coat that is not pure cotton


Biofilms formed by E. coli and Salmonella contain a new form of modified cellulose


By Michael Y. Galperin and Daria N. Shalaeva


Cellulose, a linear polymer of glucose
residues, is the main component of
plant cell walls and the most abundant
biomolecule on the planet. Cellulose fibers
from wood, cotton, and linen are
mostly used as such, but can also be
chemically modified to make rayon, viscose, 
and other textiles. Many bacteria also syn- 
thesize cellulose. Cellulose fibers produced 
by the model organism Komagataeibacter 
(Gluconacetobacter) xylinus are very similar 
to those found in plants (1) and are increasingly
used in biotechnology and
nanotechnology (2, 3). Escherichia 
coli and many other bacteria pro- 
duce cellulose as a key component 
of the extracellular matrix that 
coats the cells to form a biofilm, a 
complex multicellular community 
consisting of numerous bacteria, 
exopolysaccharides (like cellulose), 
protein fibers, and DNA (4-6). The 
cellulose in biofilms was assumed 
to be the same as that produced by 
G. xylinus, owing to the same pat- 
tern of staining with Congo red dye 
and the same cellulose synthase en- 
zyme (4-6). However, on page 334 of 
this issue, Thongsomboon et al. (7)
report that E. coli and Salmonella
enterica serovar Typhimurium produce
modified cellulose, in which
every other glucosyl residue carries 
an additional phosphoethanolamine 
(pEtN) group. These findings have 
important implications for a wide 
variety of disciplines, from microbi- 
ology to materials science. 

It appears that previous studies 
simply overlooked the presence of 
this cellulose modification. Bacterial cellu- 
lose is a stable polymer that is resistant to a 
variety of harsh treatments. When such treat- 
ments were used to purify cellulose fibers 
from microbial biofilms, the pEtN group was 
lost. Thongsomboon et al. used mass spectrometry
and several versions of solid-state
nuclear magnetic resonance spectroscopy to
detect and study the pEtN modification in 
vivo. They also showed that the pEtN group 
comes from the cell membrane lipid phos- 
phatidylethanolamine and identified BcsG, a 
subunit of the cellulose synthase complex, as 
the pEtN transferase responsible for catalyz- 
ing the pEtN modification. 

Microbiologists might be most interested
in the wide phylogenetic distribution of the
pEtN modification. In addition to E. coli and
Salmonella, BcsG is encoded in cellulose synthase
operons of some important pathogens,
including certain species of Klebsiella, Shigella,
Enterobacter, and Burkholderia, as well
as the plant pathogen Erwinia spp. (4). Thus,
biofilms formed by these organisms might
also contain pEtN-modified cellulose. This is
important because the addition of the pEtN
group is likely to change biofilm properties.
Thongsomboon et al. show that pEtN-modified
cellulose forms a more compact biofilm
mesh that is more resistant to shear forces.
However, the presence of the pEtN group
probably makes the cellulose fibers less rigid
and less inert than those consisting of pure
cellulose. This warrants reexamination of the
sensitivity of E. coli and Salmonella biofilms
to detergents and enzymes that do not affect
standard cellulose fibers but that might
be active on pEtN-modified cellulose. Thus,
the goal of preventing infection by disrupting
such biofilms might be within closer reach.

E. coli cellulose synthase complex
The findings of Thongsomboon et al. have helped to clarify the structure
of the cellulose synthase enzyme, which consists of multiple subunits
and is regulated by c-di-GMP. Cellulose can be modified with pEtN derived
from the membrane lipid phosphatidylethanolamine, a reaction catalyzed
by BcsG. This model is based on the data from (4, 7, 10-12, 15).

The work by Thongsomboon et al. should 
also help with understanding the exact 
role of the cellulose-containing biofilm as 
a virulence factor. Curiously, some of the 
worst disease-causing strains of E. coli and 
Shigella do not produce biofilms in acute 
infections (4, 8). It appears that the ability 
to form a biofilm allows these pathogens to 
better adapt to the host environment and 
shifts the infection from an acute to chronic 
state. Thus, improved understand- 
ing of pEtN cellulose biosynthesis is 
an important step toward fighting 
bacterial diseases. 

More generally, Thongsomboon 
et al. provide insight into the regu- 
lation of polymer export in bacte- 
ria. Bacterial cellulose biosynthesis 
depends on the second messenger 
cyclic diguanosine monophosphate 
(c-di-GMP) (4, 9-11). The structure 
of cellulose synthase (10) showed 
that c-di-GMP is required to allow 
the substrate access to the enzyme 
active site. The amount of cellulose 
produced by E. coli also depends on
the expression of the bcsEFG operon
and one of its products, BcsE, which
binds c-di-GMP (12). However, the
exact roles of these proteins have
remained obscure. The finding that
the catalytic domain of BcsG is located
in the periplasm (between the
inner and outer membranes) and 
modifies glucosyl residues of the 
nascent cellulose chain (7) allowed 
Thongsomboon et al. to predict the 
organization of the entire mem- 
brane-bound cellulose synthase complex (see 
the figure). This not only resolves the role of 
the bcsEFG operon, but also highlights the 
complexity of c-di-GMP-mediated regulatory 
processes. At least in E. coli and Salmonella, 
c-di-GMP seems to regulate cellulose forma- 
tion at several different levels. This multilevel 
regulation of important life-cycle decisions, 
such as biofilm formation, has been dubbed 
“sustained sensing” (13). It appears to be 
common in the microbial world but is diffi- 
cult to disentangle without a detailed analy- 
sis of regulatory interactions. 


GRAPHIC: V. ALTOUNIAN/SCIENCE


Finally, the work by Thongsomboon et al. 
will benefit efforts to find new applications 
for bacterially synthesized cellulose and 
develop new cellulose-based compounds. 
Gluconacetobacter-produced cellulose mi- 
crofibers and crystals, commonly referred 
to as nanocellulose, have numerous appli- 
cations (2). The apparent biocompatibil- 
ity—lack of toxicity, immunogenicity, and 
proinflammatory response—of unmodified 
cellulose makes it an attractive choice for a 
variety of biomedical applications, such as 
drug delivery, wound dressing, replacement 
of blood vessels, and tissue engineering of 
bone and cartilage (2, 3). Thus, pEtN-modi- 
fied cellulose would have to undergo rigor- 
ous biocompatibility testing. Nevertheless, 
the availability of genetic tools to manipu- 
late E. coli opens numerous possibilities for 
using cellulose synthase genes for synthetic 
biology. A particularly interesting develop- 
ment could be the ability to produce entirely 
new kinds of cellulose films for applications 
ranging from optoelectronics to packaging. 

In principle, one could imagine produc- 
tion of cellulose microfibers with new modi- 
fications. The catalytic domain of BcsG is a 
metalloenzyme of the alkaline phosphatase 
and sulfatase superfamily (14), which is an- 
chored in the membrane by five predicted 
transmembrane helices (see the figure). Re- 
placing the catalytic domain of BcsG with a 
different enzyme, such as an acyltransferase 
or a glycosyltransferase, could allow biosyn- 
thesis of cellulose nanocrystals with new 
optical properties, increased conductivity, 
or the ability to bind metal ions.

Taking down defenses
to improve vaccines 


A new approach to generating influenza virus 
vaccines could improve responses 


By John R. Teijaro and Dennis R. Burton


Vaccines have been spectacularly successful
in durable protection against
a range of pathogens. However, they
have been less successful against
pathogens that have evolved immune
escape mechanisms (1). For example,
the influenza virus surface glycoprotein 
hemagglutinin (HA), which is the main 
target (antigen) for protective antibodies, 
shows enormous sequence diversity between 
different strains, meaning that antibodies 
induced by immune responses to one strain 
of the virus tend to be either inefficient or 
ineffective against other strains. This obser- 
vation is often associated with the need for 
a new influenza vaccine every year. How- 
ever, the escape mechanisms of influenza 
virus extend beyond antigenic variation of 
surface proteins. For example, wild-type vi- 
ruses typically encountered in natural infec- 
tion can suppress the host type I interferon 
(IFN-I) response, which provides the first
line of defense against viral infections and 
promotes stimulation of an optimal immune 
response (2). On page 290 of this issue, Du 
et al. (3) describe the generation of a variant 
influenza virus that, in contrast to the wild 
type, is hyper-interferon-sensitive (HIS) and 
therefore attenuated (reduced in virulence). 
Attenuated viruses typically elicit weaker
immune responses than their wild-type
counterparts but, in this case, the level of 
attenuation still resulted in robust immune 
responses. The authors propose that the HIS 
approach could form the basis for a more ef- 
fective influenza vaccine. 

Following infection, viral nucleic acids 
are sensed by the host innate immune sys- 
tem through multiple pattern recognition 
receptors. Among the first antiviral pro- 
teins produced are IFN-I, a family of cyto- 
kines that signal to surrounding host cells 
and induce multiple interferon-stimulated 
genes (ISGs) that act to prevent viral amplification
and dissemination. In addition
to inhibiting viral replication, IFN-I signaling
also promotes the optimal induction
of both the innate and adaptive arms
of the immune response. To counteract a 
potent IFN-I response, many viruses en- 
code proteins that inhibit IFN-I production 
and/or signaling, highlighting the evolu- 
tionary importance of host IFN-I signaling 
in controlling early virus infection (4). In- 
fluenza virus is no exception and encodes 
nonstructural protein 1 (NS1), which exerts
potent anti-IFN-I effects and is essential
for viral fitness (5). 

To investigate the possibility of harness- 
ing IFN-I sensitivity for attenuated vaccine 
design, Du et al. employed a quantitative 
high-throughput genetic mutagenesis system 
coupled to next-generation sequencing (6, 7) 
to simultaneously measure loss-of-function 
mutations in influenza virus in the presence 
or absence of IFN-I signaling. Using this ap- 
proach, they identified multiple IFN-I-sen- 
sitive mutations across the influenza virus 
genome, including mutations outside of the 
gene encoding NS1. The authors then combined
an assortment of eight IFN-I-sensitive
mutations in multiple viral genes to create a 
HIS virus. The HIS influenza virus increased 
both IFN-I sensitivity and production, atten- 
uated viral fitness in vitro and in vivo, and 
did not produce disease pathology after high- 
dose administration to mice and ferrets (com- 
monly used models of influenza infection). 
Importantly, despite multiple mutations in 
viral genes in the HIS virus, enough anti- 
genic determinants were conserved to gener- 
ate antiviral T and B cell-mediated immune 
responses (which involve antibodies) to promote
protection after challenge with multiple
strains of influenza virus when mice and ferrets
were vaccinated with HIS viruses derived from the
H1N1 subtype and subsequently challenged
with either H1N1 or H3N2 subtypes
(see the figure). The protection
generated by the HIS influenza virus vaccine
is likely mediated through the generation of
cross-reactive T cells, which can react with
multiple viral strains. This is especially likely
to be the case because mice vaccinated with
the H1N1-HIS virus and subsequently challenged
with heterologous H3N2 influenza
viruses produce potent protective antibodies
against the homologous H1N1 virus that
will not have substantial cross-reactivity with
H3N2 viruses.

Generating live attenuated HIS virus vaccines
HIS viruses confer improved vaccination against influenza virus strains. This approach could be used to generate more effective live attenuated vaccines against other viruses.
Libraries of mutant influenza virus strains are screened for IFN-I sensitivity.
IFN-I-sensitive strains are selected and sent for RNA sequencing.

The authors argue that the approach may 
be broadly applicable in creating efficacious 
live attenuated vaccines for a plethora of vi- 
ral infections. Although a concern with live 
attenuated vaccines is often their potential 
to revert to virulence (8), attenuating the vi- 
rus with multiple point mutations, as in Du 
et al., is likely to avert this outcome. In ad- 
dition to increasing safety, the use of muta- 
tions scattered throughout the viral genome 
should provide a barrier to the development 
of viral resistance. Moreover, the avoidance 
of mutations in HA should help preserve ro- 
bust antibody responses. 

IFN-sensitive strains are sequenced and mutations incorporated into a
single virus, creating a HIS virus that is attenuated in vitro and in vivo.
Antigenic determinants are retained.
Vaccinate
Naive mouse or ferret

278 19 JANUARY 2018 • VOL 359 ISSUE 6373

The holy grail of the influenza virus
vaccine field is a universal influenza vaccine
(9). Such a vaccine would obviate the
need for an annual vaccine. Furthermore,
if it provided durable protection, such
a vaccine could be given earlier in life,
when the induction of immune responses
is more optimal, rather than later in life,
when susceptibility to serious infection
is higher (10) and vaccine efficacy significantly
declines (11). The approach of Du
et al. may be a step toward a universal influenza
vaccine in that it will be safe and
retain sufficient antigenic determinants to
promote immune protection against multiple
viral strains. However, many challenges remain.
Data on cross-protection are limited
to exposure to a small set of strains from
the H1N1 and H3N2 subtypes of influenza
viruses. It would be valuable to test additional
viruses, including highly virulent
avian subtypes such as H5N1 and H7N9,
during subsequent challenge studies.
Another limitation is that the approach
relies heavily on T cell immunity for cross-strain
protection. However, much evidence
suggests that B cell immunity in the form of
antibodies can be very important in protection
against influenza virus and indeed antibodies
are the oft-used correlate of likely
vaccine efficacy (12). In terms of the HIS
approach and antibody-mediated responses,
the diversity of the HA molecule is a problem
because the approach is largely expected
to induce strain-specific antibodies and the
HA molecule rapidly mutates under immune
selection pressure. It might be beneficial to
combine the HIS approach with one that
targets the induction of broadly neutralizing
antibodies (bnAbs), which can bind and
neutralize diverse influenza virus strains. For
example, one could use emerging constructs
being developed that promote the generation
of HA bnAbs that recognize the stem region
of the HA molecule, which is relatively
conserved across different viral strains (9). Using
a combined approach might increase the
titer of difficult-to-induce bnAbs. Moreover,
the added T cell immunity could further
improve overall efficacy in humans. Indeed,
there is evidence that both T and B cell immunity
can contribute to vaccine protection
against viral infections (12).

1 Naive mice or ferrets are vaccinated with the HIS influenza virus.
2 Mice or ferrets are challenged with homologous or heterologous virus strains.
Animals challenged with HIS virus exhibited superior protection compared
to mock challenged animals with respect to both reduced mortality and viral loads.

It will also be important to determine 
whether the approach of Du et al. can be
extended to the generation of more effec- 
tive live attenuated vaccines against other 
viruses. As part of such studies, the genetic 
analysis could be applied to other critical in- 
nate immune signaling pathways such as mu- 
tating viral proteins that attenuate the viral 
RNA-sensing retinoic acid-inducible gene-I 
(RIG-I) and melanoma differentiation associated
protein-5 (MDA-5) (13) as well as viral
proteins that antagonize IFN-I signaling.

DNA ORIGAMI


Remote control of nanoscale devices 


A DNA nanodevice can be manipulated with an applied electric field 


By Björn Högberg


Processes that occur at the nanometer
scale have a tremendous impact on
our daily lives. Sophisticated evolved
nanomachines operate in each of our
cells; we also, as a society, increasingly
rely on synthetic nanodevices for
communication and computation. Scientists 
are still only beginning to master this scale, 
but, recently, DNA nanotechnology (1)—in 
particular, DNA origami (2)—has emerged 
as a powerful tool to build structures precise 
enough to help us do so. On page 296 of this 
issue, Kopperger et al. (3) show that they are 
now also able to control the motion of a DNA 
origami device from the outside by applying 
electric fields. 

Self-assembled in a test tube from hun- 
dreds of single strands of DNA, DNA 
origami devices can form almost any con- 
ceivable shape at the nanoscale. Shapes in 
and of themselves, however, do not make 
a machine. There is increasing interest in 
nanotechnological approaches that can per- 
form tasks on the nanoscale, particularly 
within the human body itself. Kopperger et 
al. make an impressive stride in this direc- 
tion by creating a dynamic DNA origami 
structure that they can directly control from 
the macroscale with easily tunable electric 
fields—similar to a remote-controlled robot. 

Kopperger et al. immobilized a typical 
three-dimensional (3D) DNA origami (4) 
platform attached to a flexible rod on a 
glass slide. This rod essentially looks like 
a gear stick that protrudes out of a square 
platform locked in place on the glass (see 
the figure). Most of the structure is made 
out of relatively rigid DNA double helices 
that are bundled together, but the stick it- 
self is attached to the platform by flexible 
single-stranded DNA pieces and can thus 
rotate with respect to the fixed platform. 

Under normal conditions, when dis- 
persed in common water-based buffers, 
DNA is a charged molecule. Therefore, ap- 
plying an electric field to the buffer solu- 
tion surrounding the platform will force the 
charged DNA stick to move in the direction 
of the field (see the figure). This simple idea 
is the basis for the work. Earlier work has
used similar concepts (5), but the flexible
arm in conjunction with the fixed platform 
in the present work allows the authors to 
show precise positional control with respect 
to the platform. To achieve this, the authors 
use a system of latches. These are short, 
protruding single-stranded DNAs that mo- 
mentarily grab the arm and lock it in place 
at predefined positions. With these in place, 
the arm can only be forced into a new posi- 
tion by using enough applied electric force. 

Again, the analogy with a gear stick might 
be appropriate. Think of these latches as the 
predefined shift pattern on a manual-trans- 
mission lever: once in position 1, some force 
is required to shift the stick to position 2, and 
so on. This construction allowed the research- 
ers to precisely move the lever from position 
to position, despite the relatively rough align- 
ment of the external electric field. Kopperger 
et al. demonstrate these and other capabili- 
ties of their system using total internal re- 
flection fluorescence (TIRF) microscopy, an 
imaging technique that allows researchers to 
precisely track molecular motions. 
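
The gear-stick picture can be written down as a very small state machine. The sketch below is only an illustration of the latching logic described above, not a model of the real device: the class name, the number of positions, and the release threshold are all hypothetical, and the force is in arbitrary units.

```python
from dataclasses import dataclass


@dataclass
class LatchedArm:
    """Toy model of an arm held at discrete angular positions by latches."""

    position: int = 0
    n_positions: int = 4        # hypothetical number of latch sites
    release_force: float = 1.0  # hypothetical threshold, arbitrary units

    def apply_field(self, force: float, direction: int) -> int:
        """Move one latch position only if the force exceeds the threshold."""
        if direction not in (-1, 1):
            raise ValueError("direction must be +1 or -1")
        if abs(force) > self.release_force:
            # The arm rotates, so positions wrap around.
            self.position = (self.position + direction) % self.n_positions
        return self.position


arm = LatchedArm()
weak = arm.apply_field(force=0.5, direction=1)    # below threshold: stays latched
strong = arm.apply_field(force=2.0, direction=1)  # above threshold: advances one step
print(weak, strong)
```

The point of the threshold is the one made in the text: because the arm only moves when the applied force is large enough, and then settles at the next predefined site, a roughly aligned external field is sufficient to select a precise position.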

Although the experiment in itself is fas- 
cinating, a lot of the promise lies in what 
could potentially be achieved when this approach
is combined with other recently published
techniques. Combining the latching
system with precise lithographic patterning 
and optical readout (6) could possibly form 
the basis for a new type of digital memory. 
Remote-controlled picking up and placing of 
molecular components could potentially be 
accomplished if this concept was combined 
with cargo-transfer techniques (7, 8). If this 
became possible, one could imagine a future 
where molecules could be put together fol- 
lowing orders from an outside operator. 
DNA origami is effectively already a 3D- 
printer system for the nanoscale (9, 10), but 
with some of these extensions, Kopperger et 
al.’s technique could enable true 3D printing 
of molecules. 
Controlled nanoscale motion
Kopperger et al. have created a DNA arm that can
be rotated with an external electric field. Single-stranded
DNAs (ssDNAs) that hybridize to each other act as
latches that fix the arm in position until sufficient electric
force is applied to rotate the arm.
RETROSPECTIVE


Ben Barres (1954-2017)


A passionate neuroscientist and advocate of equal
opportunity in the sciences


By Martin Raff


Ben Barres transformed our understanding
of brain glial cells. He lived
an extraordinary life and died too 
young after a 2-year battle with pan- 
creatic cancer. From early childhood, 
he suffered greatly from gender dys- 
phoria, until transitioning from Barbara 
to Ben at age 43. This provided enormous 
relief, as well as rare insight into how dif- 
ferently females and males 
are treated, fueling a pas- 
sionate intolerance to 
prejudice. He was the first 
openly transgender person 
elected to the U.S. National 
Academy of Sciences. 

Barres was born and 
raised in West Orange, New 
Jersey, with the given name 
Barbara. Neither of his par- 
ents went to university, and 
they and their three other 
children showed no interest 
in science, whereas Barres 
wanted to be a scientist by 
age 5, excelled in science 
and math at school, and 
took extra classes in these subjects at nearby 
universities. As a scholarship undergradu- 
ate student at the Massachusetts Institute of 
Technology, he took a brain science course 
that sparked his interest in neuroscience. Af- 
ter receiving a medical degree at Dartmouth 
Medical School, he completed a neurology 
residency at the Cornell Cooperating Hospi- 
tals in New York. 

Having decided to become a neuroscientist,
he entered the neuroscience Ph.D.
program at Harvard Medical School. After
several lab rotations, he chose to work with 
David Corey, an auditory hair-cell physiolo- 
gist with expertise in the patch-clamp tech- 
nique that Barres wanted to use to hunt for 
ion channels in glial cells in the mammalian 
central nervous system (CNS)—cells that 
had become his passion since his neurology 
residency. While the Corey lab was a bold 
choice, as it had no interest or experience 
in glial cells, Barres’s earlier decision to 
devote his life to understanding these cells 
was even bolder, as they were thought to be
uninteresting and relatively unimportant. 
Barres, however, suspected that they were 
far more important than recognized and 
spent the next 34 years proving it. 

Early in his graduate work, Barres real- 
ized the advantages of identifying and iso- 
lating the different types of CNS cells and 
using cell-type-specific antibodies to do so. 
This approach enabled him to use patch- 
clamp recording to discover an unexpected 
variety of ion channels 
in the plasma membrane 
of three types of glial 
cells—astrocytes, whose 
functions were mostly un- 
known; oligodendrocytes, 
which myelinate CNS ax- 
ons; and oligodendrocyte 
precursor cells (OPCs). The 
findings were new and im- 
portant, making him an 
internationally recognized 
player in the glial field. 

Barres next spent 3 years 
as a postdoc in my lab at 
University College London, 
where he studied the early 
development of oligoden- 
drocytes. Working largely independently, he 
devised a method to purify OPCs from de- 
veloping rat optic nerve and to culture them 
at low density under conditions where they 
would survive, proliferate, and differentiate 
into oligodendrocytes. This allowed him to 
show how various known extracellular sig- 
naling molecules influence these three pro- 
cesses at the level of individual cells. Among 
his many discoveries at University College 
London was that newly formed oligoden- 
drocytes in the developing rat optic nerve 
normally die in large numbers in a compe- 
tition for limiting amounts of survival sig- 
nals on axons—an elegant mechanism for 
matching the number of oligodendrocytes 
to the number and length of axons during 
CNS development. 

The most important discoveries, however,
came after he started his own lab at
Stanford University at age 39, where Barres
remained for the rest of his career, attract- 
ing many outstanding students and post- 
docs and transitioning to Ben after 4 years. 
The Barres lab made a seemingly endless 
series of astonishing discoveries about how 
glial cells function in the normal and diseased
rodent and human brain.

Most of the lab members studied glial- 
neuronal interactions, working out ways 
to purify different cell types and keep 
them alive in culture, which took years of 
pioneering work. Once they had demon- 
strated a glial cell influence on a neuron, 
it took even longer to identify the mol- 
ecules mediating the interaction—usually 
signaling molecules secreted by the glial 
cell and the complementary receptors on 
the neuronal surface. They then used mice 
deficient in these molecules to verify their 
roles in vivo. 

I can mention only a few of the many 
transforming discoveries from the Barres 
lab. They showed that astrocytes are re- 
quired for many CNS neuronal synapses 
to form, mature, and function normally, 
and they identified the signal molecules 
and receptors mediating these interac- 
tions. The Barres lab also demonstrated 
that both astrocytes and microglia (the 
specialized macrophage-like glial cells in 
the CNS) have crucial and distinct roles in 
synapse elimination, which occurs during 
normal neuronal development and in many 
neurodegenerative disorders; moreover, 
they identified distinct phagocytic path- 
ways involved in each case, including the 
surprising role of complement proteins in 
the microglial pathway. Recently, they dis- 
covered a type of reactive astrocyte that de- 
velops in some forms of brain injury, which 
secretes toxins that kill injured neurons and 
oligodendrocytes. This type of astrocyte is 
greatly increased in various neurodegenera- 
tive diseases, where it is thought to contrib- 
ute to neuronal cell death, making it and its 
toxins potential targets for drug discovery. 

Barres’s influence extended well beyond 
his lab’s discoveries. He was an esteemed 
teacher, mentor, and chair of Stanford’s Neu- 
robiology Department. He organized glial 
meetings and courses and wrote many per- 
ceptive reviews. He made his lab’s invaluable 
experimental protocols, reagents, and cell- 
type-specific transcriptome databases freely 
available, which greatly accelerated progress 
in the field. He fought energetically and effec- 
tively for the many issues he was passionate 
about, including the plight of women, minor- 
ities, and transgender individuals in society 
and especially in science. 

Ben’s death leaves a gaping hole in the 
glial field he dominated for so long. He is 
irreplaceable: His intelligence, passion, 
generosity, and childlike enthusiasm and 
frankness will be deeply missed by the re- 
markably large number of us who cherished 
him and owe him so much. 


10.1126/science.aas9270 
COMPLEXITY


Quarks, culture, combogenesis 


A multidisciplinary tour of cosmic history charts 
the “grand sequence” of existence 


By Barry Wood 


The value of Tyler Volk’s Quarks to
Culture is evident when the book is
placed against popular histories of
the universe, dozens of which have
provided evidence for an immense

cosmic past. But such histories are 
often anecdotal, like early British 
histories of the kings of England. 
Unlike these works, Volk artfully 
presents the case for structural 
continuity and systematic cre- 
ativity across 13.8 billion years of 
cosmic history. 

He begins with a simple ob- 
servation: “As you go down into 
the body, you go back in time: 
from the body inward to cells, 
to molecules, and then to atoms. 
Passing from life to physics, 
each first type in this series of 
nested things came into existence earlier.” 
The primary achievement of his book is the 
clear articulation of the temporal sequence 
in which the smallest particles combine to 
form atoms, which form molecules, then 
cells, and eventually organisms. He fur- 
ther extends his analysis to dynamically 
related tiers that ascend to animal herds 
and hives, tribal associations, and eventually
large societies. Volk goes beyond current
concepts of emergence, complexity,
self-organization, and autopoiesis with a
sustained and impressive presentation of
“combogenesis,” his own term for the innovative
creativity of the physical, biological,
and cultural realms. The fact that he
accomplishes this grand sweep within just
250 pages makes the book a superb
contribution deserving of wide readership.


Quarks to Culture
How We Came to Be
Tyler Volk
Columbia University
Press, 2017. 280 pp.


The reviewer is at the Department of English, University of
Houston, Houston, TX 77204, USA. Email: bwood@uh.edu

Volk’s study traces the se- 
quence of combogenesis in 12 
chapters. A series of parallel 
diagrams at the head of each 
chapter clarifies the combinatory 
steps from each level to the next. 
Acknowledging familiar meta- 
patterns of layers, hierarchies, 
thresholds, and domains, he em- 
phasizes the structural relation 
of these tiers as “nested.” The 
term is most obviously applicable to quarks, 
nucleons, and atoms but takes on meta- 
phoric richness in the higher biological and 
cultural realms. Principles that may seem 
specific to agrovillages or geopolitical states 
are nested within the governing limitations 
of earlier structures while manifesting the 
creative integration evident at higher levels. 

A recent emphasis on “big history,” which
ranges over similar territory, is recast as
“grand sequence” in Quarks to Culture, a
term that bypasses the march of events in
order to highlight the mystery of temporal
change as the fundamental reality. Volk
skips the usual rehearsal of geological eras,
land colonization, continental drift, and
dinosaur extinction, focusing instead on
the big picture. In doing so, he manages to
frame a complex welter of multidisciplinary
information as a “narrative of the universe”
that is creative and compelling.


A complex, colonial organism forms when individual
red-spotted siphonophores come together.

Despite the book’s emphasis on story- 
telling, Volk must occasionally resort to 
numbers to impress upon the reader the 
complexity of contemporary existence. 
A suite of fundamental particles is orga- 
nized into 92 elements, he writes, which 
are combined in hundreds of molecules 
before the rise of life. Millions more come 
into being after the rise of life, and perhaps 
100 billion can be found in what Volk calls 
the “protein sequence space.” His rigorous 
philosophical emphasis assumes that his 
readers are already aware of the vastness 
of time and space, nuclear fusion in stars, 
the evolution of the solar system, and theo- 
ries of life’s origin. 

Communication, Volk shows, also has a 
deep history: from animal calls to speech 
among tribal associations and agrovillages 
to writing in geopolitical states. In Chapter 
16, he introduces the concept of “alphakits”: 
26 letters and 40 phonemes that have given 
rise to thousands of words and millions of 
linguistic artifacts and provided a founda- 
tion for cultural construction, preservation, 
and transfer. But the term applies to lower 
levels, too, he argues: A suite of quarks, el- 
ements, amino acids, and proteins consti- 
tutes early alphakits for evolution. 

In a recent Science Editorial, Gordon 
McBean and Alberto Martinelli noted, “De- 
spite decades of efforts toward better in- 
tegration, much of society still presumes a 
stark divide between the disciplines, and 
most scientists continue to be trained, 
evaluated, and rewarded in disciplinary 
silos” (1). The “grand sequence” advocated 
in Quarks to Culture is, as they and others 
have argued, largely unrecognized. Volk, 
however, marshals evidence from a dozen 
disciplines filtered through a rigorous intel- 
lect with grace and without polemics. 

With impressive learning, rigorous anal- 
ysis, and artful writing, Quarks to Culture 
presents a unifying philosophy reminis- 
cent of Alfred North Whitehead’s explora- 
tion of process. 


REFERENCE 
1. G. McBean, A. Martinelli, Science 358, 975 (2017).
Enrico Fermi, flaws and all


A revealing biography falls short when it comes to the 
famous physicist’s problematic treatment of women 


By Megan Formato 


With the title The Last Man Who
Knew Everything and a first 
chapter entitled “Prodigy,” a 
reader could be forgiven for ex- 
pecting David Schwartz’s new 
biography of Enrico Fermi to be 
a straightforward hagiography. Luckily, 
Schwartz’s ambitions are not as simple as 
providing yet another account of a great 
man of 20th-century physics. He has other, 
thornier questions in mind, some of which 
he credibly addresses and others that he 
handles less convincingly. 

Moving from Fermi’s birth 
in Rome in 1901 to his death 
in Chicago in 1954, The Last 
Man grapples with Fermi’s 
legacy as a teacher and men- 
tor, his contributions as a sci- 
entist in Mussolini’s Italy and 
later to the Manhattan Project 
in the United States, and his 
change of heart about the hy- 
drogen bomb after World War 
II. (Fermi initially opposed the 
project but later worked on it.) 

We learn how Fermi’s sci- 
ence was shaped by his fam- 
ily, by the many educational 
and scientific institutional 
contexts through which he 
moved, and by major political 
moments in both Italy and the 
United States. 

The Last Man Who Knew 
Everything is at its best in 
the chapters devoted to the 
Manhattan Project. Following 
Fermi gives us a vantage point that is, at 
least initially, more focused on Columbia 
University and the University of Chicago 
than Los Alamos. 

Shifting the focus to these locations al- 
lows us to see in great detail the institutional 
cultures and social worlds that the Fermis 
helped maintain. Here, Schwartz’s science 
communication skills also shine with pa- 
tient, easy to follow, nontechnical descrip- 
tions of the construction of fission piles. 
Enrico Fermi sits with his wife, Laura, an author and historian, in 1954.


One of Schwartz’s sustained insights is 
that “[c]alculations of probabilities run
like a bright thread throughout [Fermi’s] 
work.” It is particularly helpful to see 
his approach to the Manhattan Project 
through this statistical lens. 

Fermi, we learn, was initially disinclined 
to keep the Manhattan Project a secret be- 
cause he thought the development of such 
a weapon was statistically unlikely. His un- 
derstanding of the probability of a fission 
weapon shifted during a conversation with 
the German physicist Werner Heisenberg in 
Michigan in 1939. 


Schwartz’s account of this exchange is 
one of the most illuminating in the book 
because it reveals a turning point for Fermi 
and therefore for the Manhattan Project. 

Fermi and Heisenberg disagreed on 
what kind of agency and dissent is pos- 
sible for scientists working under fascist 
governments. When Heisenberg returned 
to Germany, Fermi was convinced that 
he would work on a fission weapon for 
Hitler, a belief that instilled a sense of 
greater urgency in Fermi with regard to his 
own work. 

The book is less compelling in early 
chapters when, in order to persuade the 
reader that Fermi’s genius was recognized
in its own time, Schwartz strays more fre- 
quently into cliché and hyperbole. It is 
hard to square the description of Fermi as 
“already a legend” or “the Voltaire of phys- 
ics” during his time at Scuola Normale 
and university in Pisa alongside nearly si- 
multaneous mentions that he was almost 
expelled for a prank and had difficulty on 
his exams. 

More troubling is Schwartz’s handling of 
Fermi’s treatment of women. Describing a 
letter Fermi wrote to a male friend about a 
skit in which Fermi “ridiculed [Women]— 
‘barring one or two exceptions 
ugly enough to scare anybody’—
by portraying them as incapable 
of reducing a simple fraction,” 
Schwartz acknowledges Fermi 
showed “a distinctly unattract- 
ive attitude toward his fellow 
female students.” 

However, he quickly sets the 
skit aside, writing: “Perhaps 
it reflected merely the widely 
shared prejudices of his time 
and culture. More than likely, it 
was also a bravado with which 
he could mask his awkwardness 
around women, an ineptness he 
would eventually outgrow.” 

Fermi’s lifelong teasing of his 
wife is similarly acknowledged 
throughout the book but then 
minimized as an “idiosyncrasy” 
that did not interfere with their 
marriage. By raising some of 
Fermi’s misogynistic behaviors 
only to repeatedly brush them 
aside, Schwartz props up the 
tired, problematic trope of the eccentric 
male genius who is not held responsible for 
his destructive social behavior. 

At its best, The Last Man Who Knew Ev- 
erything resists drawing too clean a line 
between the personal and the scientific in or- 
der to explore how Fermi’s contributions as 
a scientist were contingent on “the specific 
circumstances” of his life. But by not letting 
Fermi’s mistreatment of women more thor- 
oughly inform the biography, Schwartz falls 
short of his ambition to provide a complex 
portrait of the famous physicist. 


10.1126/science.aar2948 
The pitfalls of taking
science to the public 


Daily newspapers play a fundamental role 
in informing the public about research 
findings and discussions. Scientists, in 
turn, are increasingly expected to interact 
with the media. However, this system is 
devoid of peer review, and it is the editors, 
not authors, that often have the final say 
on how a piece is worded. 

On 22 November 2017, The Washington 
Post published a perspective piece on the 
inevitability of large-scale extinctions (1).
Stiff criticism followed on the comments 
page, social media, and blogs (2-5). We 
launched an effort to show that the views 
in that article do not represent the sci- 
entific majority (6). On 15 December, The 
Washington Post ran our response repre- 
senting more than 3000 scientists from 88 
countries, including many prominent sci- 
entists and Nobel laureates (7). Meanwhile, 
the perspective’s author posted an essay 
on his laboratory blog expressing regret 
about how his words had been edited and 
interpreted (8). In a sense, the scientific 
record has been corrected. However, public 
debate is a matter of timing and scale. Our 
response ran after the peak of the original 
discussion, and the author’s later com- 
ments ran on a local platform likely viewed 
by far fewer people than the perspective. 
Hence, the information that reached read- 
ers was fragmentary and skewed. 

This event exemplifies why scientists, 
before they approach the media, should
understand the differences between 
publishing in scientific journals and pub- 
lishing in news outlets. In the absence of 
peer review, it is essential to ask for advice 
about how text about scientific concepts 
could be misconstrued by nonspecialists. It 
is also helpful to suggest titles, which are 
often written by news staff who aim for 
accuracy but risk making sensationalistic 
statements. Scientists must also weigh the 
power of joint initiatives with the impact 
of speedy responses by individuals. 

We urge scientists to continue engaging 
with the media, but beware these pitfalls. 
Ask for advice from colleagues or your 
communications office, request review and 
approval of any text before publication, 
and hurry up—without compromising the 
quality of your contribution. 
Vaccine mandates in
France will save lives 


In their Letter “France’s risky vaccine man- 
dates” (27 October 2017, p. 458), J. K. Ward 
et al. question the adoption of mandatory 
vaccination in France. Their prediction 
that such a step will encourage resistance 
to vaccination is unsupported by the facts 
and could prolong a dangerous situation 
in which French citizens have the right to 
allow their children to catch and transmit 
potentially fatal infections. 

The French recommendation, which
has now gone into effect (1), was the
product of two juries composed of both
medical professionals and lay citizens
(2), suggesting that Ward et al.’s concerns
about acceptance by doctors and
the public are unfounded. Moreover,
evidence shows that mandates are effec- 
tive. In California, immunization rates 
increased after so-called “philosophical 
exemptions” were eliminated (3). 

Vaccine hesitancy in French physicians
has been found to be only moderate in
prevalence (4). Even one vaccine-hesitant
doctor is too many, but Ward et al. do
not offer a solution to the problem, such
as better education by medical schools.
Furthermore, a reference cited by Ward
et al. does not, as they claim, show that
mandating vaccines increases anti-vaccinationism,
but rather that citing
dangers of diseases is more effective than
arguing for safety of vaccination (5).


New French vaccine mandates are now in effect.

Ward et al.’s reasoning could be extrapolated
to argue against mandating car seats
for young children. Like car seats, vaccina- 
tion mandates will likely save lives. 
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‘Buddhist 
eat lunch in 
eir-monastery. 


Have your momos and eat them, too 
Sitting at breakfast with a group of Tibetan Buddhist monks at a monastery in southern
India, I am surprised to see a plate of chicken on the table. “Is vegetarianism not required
at the monastery?” I ask. Their reply is unexpected: Although no meat is served in the
monastery dining hall, individual monks can choose to eat meat elsewhere, including the
monastery’s guesthouse where I and my fellow scientists are staying.

On my previous visit to northern India, I was similarly surprised to find monks leaving
the monastery to eat Tibetan meat buns (“momos”) at nearby restaurants. At the time,
I thought this reflected the attitudes of the monks themselves; they liked eating meat
so much that they would leave the monastery to do it! But with this second trip and
the additional context, I realize that the situational flexibility of meat eating is part
of Tibetan Buddhism itself. When the monks are at the monastery, they eat no meat
because the monastery is vegetarian. But that doesn’t mean that they are unable to
leave and eat meat on their own.

This experience shed light on my research, an exploration of ethical issues in neuroscience
with the monastics as part of the Science for Monks program. I had asked individual
monks for help solving ethical puzzles faced by scientists in the United States. Often
their responses would avoid the puzzle altogether. When I asked whether it was right for
people to augment their abilities with medication, they wondered why people would use 
enhancement technologies in the first place if they were neither free of negative side 
effects nor effective in the long term. What if people didn’t debate whether to eat meat or 
not, but assessed how important it was to eat meat in each situation? The monks, hap- 
pily eating their chicken for breakfast, seem perfectly content with this arrangement. 
3D-printed interconnected plastic
modules for chemical production 
Kitson et al., p. 314 
VASCULAR BIOLOGY
Lymphatics limp along 
after MRSA 


Lymphedema is associated 
with skin and soft tissue 
infections, and both can be 
recurring, causing continual 
suffering in affected patients. 
To better understand the 
relationship between bacterial 
infections and lymphedema, 
Jones et al. used intravital 
imaging. They examined the 
lymphatics of mice infected 
with MRSA (methicillin-resis- 
tant Staphylococcus aureus) 
and observed lymphatic 
muscle cell death, which led to 
prolonged dysfunction months 
after the bacteria had been 
cleared. In vitro experiments
with human cells indicated
bacterial toxins were responsible
for damaging the lymphatic
muscle cells. The findings sug- 
gest that these bacterial toxins 
could be targeted in patients to 
interrupt this brutal cycle. —LP 
Sci. Transl. Med. 10, eaam7964 (2018). 


NANOROBOTICS
Electrically driving
a DNA arm


Most nanoelectromechanical
systems are formed by etching
inorganic materials such as
silicon. Kopperger et al. improved
the precision of such machines by
synthesizing a 25-nm-long arm
defined by a DNA six-helix bundle
connected to a 55 nm-by-55 nm
DNA origami plate via flexible
single-stranded scaffold crossovers
(see the Perspective by Högberg).
When placed in a cross-shaped
electrophoretic chamber, the
arms could be driven at angular
frequencies of up to 25 Hz and
positioned to within 2.5 nm. The
arm could be used to transport
fluorophores and inorganic
nanoparticles. -PDS

Science, this issue p. 296;
see also p. 279


SYNTHETIC BIOLOGY
Large-scale gene
synthesis in tiny droplets


Gene synthesis technology is
important for functional characterization
of DNA sequences and
for the development of synthetic
biology. However, current
methods are limited by their low
scalability and high cost. Plesa
et al. developed a gene synthesis
method, DropSynth, which uses
barcoded beads to concentrate
oligos and subsequently assemble
them into synthetic genes
within picoliter emulsion droplets.
DropSynth allows generation of
large libraries of thousands of
genes and functional testing of all
possible mutations of a particular
sequence. —SYM

Science, this issue p. 343


Migrants walking through
the Little Ethiopia area
of Los Angeles, California


Data-driven refugee
assignment


The continuing refugee crisis
has made it necessary for
governments to find ways
to resettle individuals and
families in host communities.
Bansak et al. used a machine
learning approach to develop an
algorithm for geographically placing
refugees to optimize their overall
employment rate. The authors
developed and tested the algorithm
on segments of registry data from
the United States and Switzerland.
The algorithm improved the employment
prospects of refugees in the
United States by ~40% and in
Switzerland by ~75%. —BJ

Science, this issue p. 325
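
The placement idea in the refugee-assignment item by Bansak et al. can be pictured as an assignment problem: given a predicted chance of employment for each person at each location, choose the placement with the highest expected overall employment rate. The sketch below is only an illustration under stated assumptions. The probability table, the capacities, and the function name `best_placement` are hypothetical, and the exhaustive search stands in for whatever optimizer the authors actually used, which the summary does not describe.

```python
from itertools import permutations


def best_placement(prob, capacity):
    """Return (placement, mean_rate) maximizing expected employment.

    prob[i][j] is a hypothetical predicted probability that refugee i finds
    work at location j; capacity[j] is the number of open slots at location j.
    Exhaustive search, so only practical for a handful of people.
    """
    n = len(prob)
    if n == 0:
        return [], 0.0
    # One entry per open slot, e.g. capacity [1, 2] -> slots [0, 1, 1].
    slots = [j for j, c in enumerate(capacity) for _ in range(c)]
    if len(slots) < n:
        raise ValueError("not enough slots for everyone")

    best, best_total = None, float("-inf")
    # sorted(set(...)) drops duplicate orderings and keeps the search deterministic.
    for placement in sorted(set(permutations(slots, n))):
        total = sum(prob[i][placement[i]] for i in range(n))
        if total > best_total:
            best, best_total = placement, total
    return list(best), best_total / n


# Toy inputs, made up for illustration (not data from the study).
PROB = [[0.20, 0.60], [0.50, 0.55], [0.10, 0.30]]
CAPACITY = [1, 2]
placement, rate = best_placement(PROB, CAPACITY)
print(placement, round(rate, 3))
```

In this toy case the single slot at location 0 goes to the person who loses least by taking it, which is the kind of trade-off a placement algorithm has to weigh at scale.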


VACCINES 
Avoiding interferon 
avoidance 


Interferon (IFN) expression is a 
mammal’s first response to viral 
infection. Many viruses have 
thus evolved mechanisms to 
evade IFN. Du et al. developed
a method to systematically
ablate IFN evasion genes from
live, attenuated influenza virus
(see the Perspective by Teijaro
and Burton). A combination 
of mutants was assembled to 
construct a virus that triggered 
transient IFN responses in mice 
but that was unable to replicate 
effectively. The transient IFN 
responses led to robust antibody 
and memory responses that 
protected against subsequent 
challenge with different influenza 
viruses. This approach could be 
adapted to improve other RNA 
virus vaccines. —CA 

Science, this issue p. 290; 

see also p. 277


QUANTUM FLUIDS 


Making dilute 
quantum droplets 


In recent years, quantum fluids 
have been studied largely in 
gaseous form, such as the Bose- 
Einstein condensates (BECs) 
of alkali atoms and related 
species. Quantum liquids, other 
than liquid helium, have been 
comparatively more difficult to 
come by. Cabrera et al. com- 
bined two BECs and manipulated 
the atomic interactions to create 
droplets of a quantum liquid 
(see the Perspective by Ferrier- 
Barbut and Pfau). Because the 
interactions were not directional, 
the droplets had a roughly round 
shape. The simplicity of this 
dilute system makes it amenable 
to theoretical modeling, enabling 
a better understanding of quan- 
tum fluids. —JS 

Science, this issue p. 301; 

see also p. 274 


STRUCTURAL BIOLOGY 
Recognizing centromere
by kinetochore


The kinetochore proteins CENP-N
and CENP-C recognize the
histone H3 variant CENP-A in
the centromeric nucleosome.
This ensures proper kinetochore
assembly and accurate segregation
of chromosomes. Chittori
et al. describe the cryo-electron
microscopy structure of the
human CENP-A nucleosome-CENP-N
complex. The interaction
of CENP-N with CENP-A and
the nucleosomal DNA together
ensure specific and stable
centromeric nucleosome recognition.
Mutational analyses
using both human and Xenopus
CENP-A and CENP-N proteins
suggest that the proteins have
coevolved to preserve the interacting
surfaces. -SYM

Science, this issue p. 339 


INDUCED SEISMICITY 
Seismicity curbed
by lowering volume

Determining why hydraulic 
fracturing (also known as 
fracking) triggered earth- 
quakes in the Duvernay 
Formation in Canada is 
important for future hazard 
mitigation. Schultz et al. found
that injection volume was 
the key operational param- 
eter correlated with induced 
earthquakes in the Duvernay. 
However, geological factors 
also played a considerable role 
in determining whether a large 
injection volume would trigger 
earthquakes. These findings 
provide a framework that may 
lead to better forecasting of 
induced seismicity. —BG 
Science, this issue p. 304 


RESEARCH METHODS 
Algorithms fail to
improve predictions

In the United States, algorithms 
are commonly used to predict 
the likelihood that a criminal 
defendant will commit a crime, 
and these predictions influence 
pretrial, parole, and sentencing 
decisions. Commercial soft- 
ware, such as the widely used 
COMPAS system, promises to 
make these predictions more 
accurate than human judg- 
ments. Dressel and Farid show 
that COMPAS's impressive- 
sounding 137-feature black box 
is nearly equivalent to a trivial 
linear classifier using two fea- 
tures, and both approaches are 
no more accurate or fair than 
predictions made by people 
with little or no criminal justice 
expertise. —AC 
Sci. Adv. 10.1126/ 
sciadv.aao5580 (2018). 
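
To make concrete what "a trivial linear classifier using two features" means, here is a minimal sketch. The choice of features (age and number of prior convictions), the weights, and the toy records are all assumptions made for illustration; the summary above does not name the two features or give any coefficients, and this is neither COMPAS nor the authors' fitted model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoFeatureClassifier:
    """Linear rule: predict re-offense when w_age*age + w_priors*priors + bias > 0."""
    w_age: float
    w_priors: float
    bias: float

    def score(self, age, priors):
        return self.w_age * age + self.w_priors * priors + self.bias

    def predict(self, age, priors):
        return self.score(age, priors) > 0


def accuracy(model, records):
    """Fraction of (age, priors, label) records the model classifies correctly."""
    correct = sum(model.predict(age, priors) == label for age, priors, label in records)
    return correct / len(records)


# Hypothetical weights and records, chosen by hand for the example.
MODEL = TwoFeatureClassifier(w_age=-0.05, w_priors=0.3, bias=1.0)
TOY_RECORDS = [(19, 4, True), (23, 0, False), (45, 1, False), (31, 6, True)]

print(MODEL.predict(19, 4), MODEL.predict(45, 1), accuracy(MODEL, TOY_RECORDS))
```

The whole model is three numbers, which is what makes the comparison with a 137-feature black box striking.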


19 JANUARY 2018 * VOL 359 ISSUE 6373 


IN OTHER JOURNALS 




GEOPHYSICS 
Going dry in the 
Pacific Northwest 


Volcanic belts such as the Andes 
result from deep melting as 
water dragged down during 
subduction fluxes into the crust. 
Canales et al. show that the Juan 
de Fuca slab, which is subduct- 
ing below the Pacific Northwest 
in North America, is much drier 
than other subducting slabs. The 
distribution of water in the slab 
may help determine the origins 
of seismic tremor and episodic 
slip that occur in this region. It 
also confirms a hypothesis that 
volcanism in the region is not 
the result of the influence of 
water, but rather is due to the 
decompression trigger melting 
more commonly seen along mid- 
ocean ridges. —BG 

Nat. Geosci. 10, 864-870 (2017). 


MOLECULAR BIOLOGY 
Time-out for mRNAs 
in the nucleus 


Cell cycle events are precisely 
orchestrated to ensure accurate 
cell division. Yang et al. have
discovered that sequestering 
mature mRNAs in the nucleus 
modulates cell cycle players. 

In dividing Arabidopsis cells, 
nuclear retention of CDC20 
and CCS52B mRNAs prevents 
them from being released into 
the cytoplasm until the nuclear 
envelope breaks down at pro- 
metaphase. Released mRNAs 
are rapidly translated into pro- 
teins, ensuring their regulatory 
functions at the proper cell cycle 
stage. Similar nuclear sequestra- 
tion strategies may be used for 
other mRNAs in different cellular 
contexts. —SYM 

Mol. Cell. 10.1016/j.molcel.2017.11.008 (2017).


NEUROSCIENCE 
Serious damage 
by soluble tau 


Alterations in the metabolism of 
the neuronal microtubule-asso- 
ciated protein tau are central 

to several neurodegenerative 
diseases. In these diseases, 

tau usually loses solubility and 
forms aggregates that impair cell 


sciencemag.org SCIENCE 
function to trigger neuronal cell
death and neurodegeneration. 
However, the in vivo neurotoxic 
potential of soluble tau is not 
yet fully understood. Bolós et al.
stereotactically injected human 
soluble tau into the dentate 
gyrus of mice. Hippocampal 
granule neurons showed 
markedly reduced synapse 
numbers in the molecular layer. 
In addition, newborn granule 
cells showed reduced numbers 
of dendritic spines. Behaviorally, 
these animals exhibited an 
impaired capacity to perform 
pattern separation. Soluble tau 
thus causes long-term damage 
to the morphology and connec- 
tivity of newborn granule cells. 
—PRS 


Transl. Psych. 10.1038/s41398-017-0013-6 (2017).


Lighting up riboswitching 
RNAs fold as they are synthe- 
sized, and this folding is required 
for function. Uhm et al. describe 
a single-molecule fluorescence 
energy transfer assay to monitor 
cotranscriptional RNA folding. 




This approach revealed folding 
in the thiamine pyrophosphate 
(TPP) riboswitch that regulates 
translation of genes involved in 
the synthesis of thiamine, an 
essential vitamin. The riboswitch 
folds into the “off” conformation, 
in which translation is inhibited, 
even in the absence of the TPP 
ligand. If TPP is not bound to this 
off conformation, it can switch 
to the “on” conformation when 
transcription pauses near the 
translation start codon, and this 
allows translation to start. TPP 
binding stabilizes the off confor- 
mation and prevents the switch. 
The assay will allow investigation 
of other cases in which tran- 
scriptional speed and pausing 
affect RNA folding. —VV 

Proc. Natl. Acad. Sci. U.S.A. 10.1073/pnas.1712983115 (2017).


The value of scaffolds 

The brain is built by groups 

of neurons that migrate and 
interdigitate to form layers and 
circuits. This process varies in 
different phyla of animals. Garcia- 
Moreno et al. draw lessons from 


How hunting affects 
brown bear populations 


In many parts of the world, regulated hunting
is used to control the size of predator populations
such as wolves and brown bears. Bischof
et al. explore how such regulated hunting
affects the life history and demography of a
brown bear population in Sweden that has been 
monitored continuously since 1985. The study 
shows that hunting was the leading cause of 
death for bears aged more than 3 years, resulting 
in reduced life expectancy; this contrasts with 
natural conditions, where mortality is reduced 
once bears reach adulthood. Hunting also sub- 
stantially reduces the reproductive value—that 
is, the number of future offspring that female 
bears of a given age are expected to have. Thus, 
even if a carnivore population recovers numeri- 
cally, regulated hunting transforms its makeup in 
multiple ways that need to be taken into account 
in management. —JFU 


Nat. Ecol. Evol. 2, 116 (2018).


Hunting is the leading cause of death for brown bears 
older than 3 years in Sweden. 


the development of the chick 
brain to understand what makes 
the mammalian brain distinctive. 
In mammals, excitatory glutama- 
tergic neurons born deep in the 
brain migrate radially to the cor- 
tex, whereas inhibitory GABAergic 
interneurons born elsewhere 
migrate tangentially across the 


cortex. And, like the external 
scaffolds on a building under 
construction, some glutamatergic 
neurons migrate tangentially, 
instruct organization, then 
disappear. The developing chick 
brain, although it has tangentially 
migrating interneurons, lacks the 
tangentially migrating transient 
neurons. —PJH 

Cell Rep. 22, 96 (2018).


Social skills 
to pay the bills 


Employment requiring high 
math skills but low social 
skills, including many sci- 
ence and engineering jobs, 
has decreased in the United 
States as high social skills have 
become increasingly powerful 
predictors of employment and 
wage growth. Using surveys of 
occupations, skills, and wages, 
Deming shows that socially 
skilled people self-select into 
less structured jobs requiring a 
wide range of tasks, leading to 
wage gains. Increasing com- 
puterization may be a driver, 
replacing routine work and 
prioritizing social collaboration, 
but employment and wages have 
been especially strong in jobs 
demanding both high math and 
high social skills. —BW 

Quart. J. Econ. 132, 1593 (2017). 


Neurons migrate to form surface layers of the chick brain. 
CHEMICAL ENGINEERING
A plastic plan for
organic synthesis 


The infrastructure for chemical 
synthesis typically lies at either 
end of a spectrum: small-scale 
studies in ad hoc assemblies of 
glassware or large-scale produc- 
tion in capital-intensive custom 
reactors. Kitson et al. report a 
hybrid protocol that customizes 
a blueprint for synthesis of a 
target compound in a series of 
interconnected plastic modules, 
which can be assembled en 
masse by 3D printing (see the 
Perspective by Hornung). The 
approach, demonstrated for the 
commercial muscle relaxant 
baclofen, establishes a system- 
atic workflow that is potentially 
amenable to automation: All that 
is necessary for synthesis and 
purification is the introduction of 
stock solutions and variation of 
temperature or pressure. —JSY 
Science, this issue p. 314; 
see also p. 273


MEMBRANE PROTEINS 
Making your way through 
the side of a barrel 


The mechanism of membrane 
insertion and assembly of 
β-barrel proteins is a central
question of outer membrane 
biogenesis of mitochondria, 
chloroplasts, and Gram-negative 
bacteria. Höhr et al. developed
assays to address this fundamen- 
tal problem. They systematically 
mapped precursor proteins 
transported by the mitochondrial 
Omp85 channel (Sam50) to 
elucidate the entire membrane 
insertion pathway of a precursor 
in the native membrane environ- 
ment. Their findings directly 
demonstrate translocation of 
precursor proteins through 
the lumen of the mitochondrial 
Omp85 channel, signal recognition
by β-strand exchange
between channel and precursor, 
and exit through the lateral gate 
into the membrane. —SMH 
Science, this issue p. 289 
BIOPHYSICS
Watching single 
molecules in motion 


Structural techniques such 

as x-ray crystallography and 
electron microscopy give insight 
into how macromolecules func- 
tion by providing snapshots of 
different conformational states. 
Function also depends on the 
path between those states, 

but to see that path involves 
watching single molecules move. 
This became possible with 

the advent of single-molecule 
Förster resonance energy transfer
(smFRET), which was first
implemented in 1996. Lerner et 
al. review how smFRET has been 
used to study macromolecules 
in action, providing mechanistic 
insights into processes such 

as DNA repair, transcription, 
and translation. They also 
describe current limitations 

of the approach and suggest 
how future developments may 
expand the applications of 
smFRET. —VV 


Science, this issue p. 288 


MAGNETIC MATERIALS 
Boosting chiral 
nanoparticle responses 


Optical nanomaterials 

that combine chirality and 
magnetism are useful for 
magneto-optics and as chiral 
catalysts. Although chiral 
inorganic nanostructures can 
exhibit high circular dichroism, 
modulating this optical activity 
has usually required irrevers- 
ible chemical changes. Yeom et 
al. synthesized paramagnetic 
cobalt oxide (Co₃O₄) nanoparticles
with L- and D-cysteine
surface ligands. These ligands 
created chiral distortions of 
the crystal lattices, and this 
anisotropy led to much stronger 
chiroptical activity. The circular 
dichroism in the ultraviolet 

of nanoparticle gels could be 
modulated with magnetic fields 
of ~1.5 tesla. —PDS 


Science, this issue p. 309 
BIOGEOGRAPHY
A global map
of soil bacteria 


Soil bacteria play key roles in 
regulating terrestrial carbon 
dynamics, nutrient cycles, and 
plant productivity. However, the 
natural histories and distribu- 
tions of these organisms remain 
largely undocumented. Delgado- 
Baquerizo et al. provide a survey 
of the dominant bacterial taxa 
found around the world. In soil 
collections from six continents, 
they found that only 2% of 
bacterial taxa account for nearly 
half of the soil bacterial com- 
munities across the globe. These 
dominant taxa could be clus- 
tered into ecological groups of 
co-occurring bacteria that share 
habitat preferences. The findings 
will allow for a more predictive 
understanding of soil bacterial 
diversity and distribution. -AMS 


Science, this issue p. 320 


MOLECULAR BIOLOGY 
Substrate recognition 
by Dicer elucidated 


The Dicer protein generates short 
RNAs from double-stranded 
RNA (dsRNA) substrates and 
is critical for RNA interference 
and antiviral defense. Sinha et al. 
report structures of a Drosophila 
Dicer protein that shed light on 
its two distinct mechanisms for 
recognizing and cleaving sub- 
strates: adenosine triphosphate 
(ATP)-independent, distributive 
cleavage of 3'-overhang dsRNAs 
and ATP-dependent, processive 
threading of blunt-end dsRNAs. 
This flexibility might provide 
invertebrates with the optimi- 
zation capabilities needed for 
antiviral defense. —SYM 

Science, this issue p. 329 


CHEMICAL BIOLOGY 
A naturally
modified cellulose 


Cellulose is the most abundant 
biopolymer on Earth and an 


Published by AAAS 


important component of bacte- 
rial biofilms. Thongsomboon et 
al. used solid-state nuclear mag- 
netic resonance spectroscopy 
to identify a naturally derived, 
chemically modified cellulose, 
phosphoethanolamine cellulose 
(see the Perspective by Galperin 
and Shalaeva). They went on to 
identify the genetic basis and 
molecular signaling involved in 
introducing this modification in 
bacteria, which regulates biofilm 
matrix architecture and function. 
This discovery has implications 
for understanding bacterial bio- 
films and for the generation of 
new cellulosic materials. —SYM 
Science, this issue p. 334; 
see also p. 276


VASCULAR BIOLOGY 
Processing microRNAs 
for blood vessels 


Patients with hereditary hemor- 
rhagic telangiectasia (HHT) 
are prone to hemorrhages and 
nose bleeds. This is usually (but 
not always) because of mutant 
proteins in a signaling pathway 
that regulates blood vessel 
formation. Jiang et al. found 
that zebrafish or mice deficient 
in the microRNA processing 
enzyme Drosha had vascular 
defects similar to those found in 
HHT patients. Rare mutations in 
DROSHA were overrepresented 
in HHT patients who lacked 
the typical disease-associated 
mutations. Two of these mutants 
showed reduced activity and 
could not rescue the vascular 
phenotypes of Drosha-deficient 
zebrafish. -WW 

Sci. Signal. 10, eaan6831 (2018). 


THYMUS 
Regeneration circuits 
in the thymus 


Chemotherapy and radiation 
treatments in cancer patients 
damage a number of tissues and 
organs, including the thymus. 
Prolonged thymic damage 

can lead to T cell deficiency 




and increased susceptibility to
opportunistic infections and
malignancies. Wertheimer et al.
examined thymic regeneration in
mice after sublethal total body
radiation. They found a critical
role for bone morphogenetic
protein 4 (BMP4) signaling in
thymic regeneration. Endothelial
cells provided a critical source of
BMP4, which induces expression
of the transcription factor FOXN1
in thymic epithelial cells to
promote thymic regeneration. —AB
Sci. Immunol. 3, eaal2736 (2018).
Toward dynamic structural biology:
Two decades of single-molecule
Förster resonance energy transfer


Eitan Lerner,* Thorben Cordes,* Antonino Ingargiola, Yazan Alhadid, 
SangYoon Chung, Xavier Michalet, Shimon Weiss†


BACKGROUND: Biomolecular mechanisms 
are typically inferred from static structural 
“snapshots” obtained by x-ray crystallography, 
nuclear magnetic resonance (NMR) spectros- 
copy, and cryo-electron microscopy (cryo-EM). 
In these approaches, mechanisms have to be 
validated using additional information from 
established biochemical and biophysical as- 
says. However, linking conformational states 
to biochemical function requires the ability 
to resolve structural dynamics, as macromo- 
lecular structure can be intrinsically dynamic 
or altered upon ligand binding. Single-molecule 
Förster resonance energy transfer (smFRET)
paved the way for studying such structural dy- 
namics under biologically relevant conditions. 
Since its first implementation in 1996, smFRET 
experiments both confirmed previous hypothe- 
ses and discovered new fundamental biological 
mechanisms relevant for DNA maintenance, 
replication and transcription, translation, pro- 
tein folding, enzymatic function, and membrane 
transport. We review the evolution of smFRET as 
a key tool for “dynamic structural biology” over



the past 22 years and highlight the prospects for 
its use in applications such as biosensing, high- 
throughput screening, and molecular diagnostics. 


ADVANCES: FRET was first identified in the 
1920s by Cario, Franck, and Perrin. In the late 
1940s, Förster and Oppenheimer independently
formulated a quantitative theory of the energy 
transfer between a pair of point dipoles. Stryer 
and Haugland verified this theory in the late 
1960s and coined the term “spectroscopic ruler” 
for FRET. Simultaneously, Hirschfeld, and later 
Moerner and Orrit, pioneered optical single- 
molecule detection methods leading to the 
first demonstration of smFRET in 1996. This 
breakthrough made it possible to study heter- 
ogeneous systems, dynamic processes, and tran- 
sient conformational changes on the nanometer 
scale. The smFRET technique was rapidly adopted 
by various research groups to provide mecha- 
nistic answers in diverse areas of biological 
research. In early pioneering applications of 
smFRET in biochemistry, Ha et al. visualized 
the conformational dynamics of the staphy- 




Dynamic structural biology using smFRET. Left: Principle of FRET as a molecular ruler. In a 
system with a pair of dyes, after the donor dye (D) is excited, it transfers the excitation energy to a 
nearby acceptor dye (A; top) with an efficiency (E) that depends on the sixth power of the distance 
between the dyes (bottom). Right: Use of FRET to study structural dynamics at the single- 
macromolecule level. The experimental setup (top), a combination of single-molecule fluorescence 
microscopy and spectroscopy, can be used to determine conformational states or dynamics in 
solution or on immobilized molecules. Here E is calculated per each single-molecule burst of 
photons, and bursts (n) are accumulated in E histograms (middle) or for different time bins to form 


a single-molecule E trajectory (bottom). 
lococcal nuclease enzyme; Deniz et al. obtained
information on the structural dynamics of 
double-stranded DNA; and Zhuang et al. 
studied the conformation of individual RNA 
enzyme molecules and their folding dynamics 
in equilibrium. These pioneering studies were 
followed by others that used smFRET to unravel 
the inner workings of helicases and topoisom- 
erases, DNA replication, DNA repair, transcrip- 
tion, translation, enzymatic reactions, molecular 
motors, membrane proteins, nucleic acids, pro- 
tein and RNA folding, ribozyme catalysis, and 
many other molecular mechanisms. 


Read the full article at http://dx.doi.org/10.1126/science.aan1133

OUTLOOK: During the past two decades,
smFRET has grown into a mature toolset with
capabilities to explore dynamic structural biology
for both equilibrium and non-equilibrium reactions.
The one-dimensional (“ruler”) character of the
FRET approach, however,
only partially captures the complex three-dimensional
structure of a system and needs to be comple- 
mented by other techniques that can provide 
additional information about the respective bio- 
chemical states of macromolecules. Approaches 
that explore smFRET combinations with other 
biophysical techniques (patch-clamp, optical, 
and magnetic tweezers; atomic force micros- 
copy; microfluidics) or photophysical effects are 
hence gaining attention. Although smFRET 
is particularly useful for the observation of 
dynamic conformational changes and sub- 
populations, FRET efficiencies also carry very 
precise information on the actual distance be- 
tween fluorophores attached to distinct moi- 
eties of a macromolecule. As shown by recent 
work from many laboratories (such as those of 
Seidel, Michaelis, Hugel, and Grubmüller), this
quantitative information can be used to help 
define biological structures and in the future 
should find a place in the protein database of 
molecular structures. smFRET has so far mostly 
been used for in vitro experiments but can be 
used additionally to monitor conformational 
dynamics and heterogeneity in live cells. “In 
vivo smFRET” has recently emerged as a prom- 
ising methodology, demonstrated by the groups 
of Sakon, Weninger, Schuler, and Kapanidis 
among others. We envision that further tech- 
nological developments will expand smFRET 
applications beyond dynamic structural biology 
to allow fast nonequilibrium kinetic studies, 
high-throughput drug screening, and molecu- 
lar diagnostics. Advancements of these appli- 
cations will be impactful for systems that are 
highly heterogeneous and dynamic. 


The list of author affiliations is available in the full article online. 
*These authors contributed equally to this work. 
†Corresponding author. Email: sweiss@chem.ucla.edu

Cite this article as E. Lerner et al., Science 359, eaan1133
(2018). DOI: 10.1126/science.aan1133




RESEARCH 


BIOPHYSICS 


Toward dynamic structural biology: 
Two decades of single-molecule 
Förster resonance energy transfer


Eitan Lerner,* Thorben Cordes,* Antonino Ingargiola, Yazan Alhadid,
SangYoon Chung, Xavier Michalet, Shimon Weiss†


Classical structural biology can only provide static snapshots of biomacromolecules. 
Single-molecule Förster resonance energy transfer (smFRET) paved the way for studying
dynamics in macromolecular structures under biologically relevant conditions. Since its 
first implementation in 1996, smFRET experiments have confirmed previously 
hypothesized mechanisms and provided new insights into many fundamental biological 
processes, such as DNA maintenance and repair, transcription, translation, and membrane 
transport. We review 22 years of contributions of smFRET to our understanding of basic 
mechanisms in biochemistry, molecular biology, and structural biology. Additionally, 
building on current state-of-the-art implementations of smFRET, we highlight possible 
future directions for smFRET in applications such as biosensing, high-throughput 


screening, and molecular diagnostics. 


Until the mid-1990s, insights in structural
biology came mainly from static macro- 
molecular structures obtained by x-ray 
crystallography (1, 2). Nuclear magnetic
resonance (NMR) spectroscopy allowed 
identification of many of the different structures 
associated with a single conformation of a biomolecule
(1). However, a biomolecule can adopt
many different conformations. Cryo-electron mi- 
croscopy (cryo-EM) has recently complemented 
this toolkit, facilitating the determination of mul- 
tiple conformations of macromolecular structures 
in the ensemble with near-atomic resolution (3, 4). 
Molecular mechanisms can be inferred from such 
static structural “snapshots” and validated using 
biochemical and biophysical assays [e.g., (5)]. Al- 
though structural snapshots can identify distinct 
conformational states that macromolecules ex- 
plore at equilibrium (e.g., ligand-bound or un- 
bound, folded or unfolded), they lack information 
on the interconversion dynamics between these 
states. Understanding the functional roles of these 
structures requires a full dynamic picture (6-8). 
NMR, electron paramagnetic resonance (EPR) 
(9), and double electron-electron resonance (DEER) 
(10) spectroscopies, as well as fluorescence-based
techniques such as fluorescence anisotropy (11), ensemble
Förster resonance energy transfer (FRET,
Fig. 1) (12), or photo-induced electron transfer (13),
can provide access to dynamic information about 
biomolecular interactions and macromolecular 
conformations. The interpretation of experimental 
results from these techniques is, however, highly 
model-dependent (14, 15). Even for two-state systems
in equilibrium (e.g., transitions between open
and closed conformations of a protein, or bound 
and unbound states of interacting molecules; Fig. 2, 
A and B, respectively), ensemble methods yield lim- 
ited insight into structural and mechanistic details. 
This is because molecules in an ensemble undergo 
changes between conformational states asynchro- 
nously (Fig. 2C). This results in averaged-out sig- 
nals (Fig. 2C), so that the underlying dynamical 
information can be retrieved by model fitting only, 
and only in the simplest cases (16, 17). One way of 
solving the problem of asynchronicity is by mea- 
suring one molecule at a time and retrieving the 
underlying conformational states and dynamics 
directly (Fig. 2, D and E). 
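
The averaging argument above can be illustrated with a small simulation. The following is a minimal sketch, not the authors' analysis: it assumes a hypothetical two-state molecule that hops between a low-FRET and a high-FRET state with a fixed switching probability per time step (a random-telegraph model). All names and parameter values (`two_state_trajectory`, `p_switch`, the efficiencies 0.2 and 0.8) are illustrative assumptions.

```python
import random


def two_state_trajectory(n_steps, p_switch, e_open=0.2, e_closed=0.8, rng=random):
    """Simulate one molecule hopping between two FRET states (hypothetical values).

    Each molecule starts in a random state, so different molecules are
    not synchronized with one another.
    """
    closed = rng.random() < 0.5
    trajectory = []
    for _ in range(n_steps):
        if rng.random() < p_switch:
            closed = not closed  # stochastic transition to the other state
        trajectory.append(e_closed if closed else e_open)
    return trajectory


def ensemble_average(trajectories):
    """Average the FRET signal over all molecules at each time point."""
    n = len(trajectories)
    n_steps = len(trajectories[0])
    return [sum(t[i] for t in trajectories) / n for i in range(n_steps)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
trajectories = [two_state_trajectory(200, 0.05, rng=rng) for _ in range(2000)]
single = trajectories[0]
average = ensemble_average(trajectories)

# A single molecule only ever reports one of the two discrete states ...
print("single-molecule values:", sorted(set(single)))
# ... whereas the ensemble signal stays near the population mean.
print("ensemble range:", min(average), max(average))
```

In this toy model the single trajectory shows discrete jumps, while the ensemble trace is nearly flat, so the dwell times in each state cannot be read off the averaged signal directly.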

Following the development of single-molecule 
detection techniques (18-24), the first demon- 
stration of FRET at the single-molecule level was 
published in 1996 (25). It suggested that single- 
molecule FRET (smFRET) could be used to study 
dynamic processes and identify transient confor- 
mations and interactions between macromolecules 
labeled with a donor-acceptor dye pair. Schütz et al.
used smFRET to monitor binding of ligands to 
streptavidin immobilized on phospholipid mem- 
branes (26), opening the way for similar experiments 
in live cells. In another pioneering application of 
smFRET, Ha et al. characterized the intricate con- 
formational and substrate-binding dynamics of 
the staphylococcal nuclease enzyme (27). Further
smFRET studies of conformational changes in 
other enzymes and in RNA molecules followed, 
using either diffusing molecules (Figs. 2D and 3A) 
(28, 29) or immobilized molecules (Figs. 2E and 
3C) (6, 27, 30, 31). In one example, Deniz et al. 
showed how information on distance-related dis- 
tributions can be derived from smFRET mea- 
surements of double-stranded DNA (dsDNA; Fig. 
3B) (28). Additionally, the combination of total 
internal reflection (TIR; Fig. 3, C and D) illumi- 
nation with immobilized single molecules allowed 
Zhuang et al. to follow the conformation of in- 
dividual RNA enzyme molecules and measure 
their folding dynamics at equilibrium (Fig. 3D) 
(6). In the two following decades, smFRET has 
matured into a toolkit to explore dynamic structur- 
al biology. This article reviews achievements in 
the use of smFRET to establish structure-function 
relationships and outlines challenges and pros- 
pects for the future. 


A brief historical overview of 
single-molecule FRET in biochemistry 
and molecular biology 


FRET was first identified in the 1920s by Cario, 
Franck, and Perrin. In the late 1940s, Forster and 
Oppenheimer independently formulated a quan- 
titative theory of the energy transfer between a 
pair of point dipoles. Stryer and Haugland verified 
this theory in the late 1960s and coined the term 
“spectroscopic ruler” for FRET. Around the same 
time that the effect of heterogeneity on FRET was 
taken into account in ensemble measurements 
(32), Hirschfeld pioneered single-molecule fluo- 
rescence detection (33). The first observations of 
individual fluorescent molecules in the late 1980s 
and early 1990s (34-38) were followed by an ex- 
plosion of studies, including imaging of complex 
biological systems such as molecular motors (39). 
Since its first demonstration in 1996, smFRET 
has been used to provide mechanistic answers 
in diverse areas of biological research. These stu- 
dies unraveled molecular mechanisms of heli- 
cases and topoisomerases (40), DNA replication, 
DNA repair (41), transcription (42-44), translation
(42, 45, 46), enzymatic function (47-49), molecular
motors (50), membrane proteins (51), protein folding
(52, 53), nucleic acids (54, 55), RNA folding
(54, 56, 57), and ribozyme catalysis (58, 59). Be- 
cause a short review cannot do justice to the 
large number and diversity of smFRET studies, 
we will discuss a few representative examples. 
The theory describing FRET is given in Box 1. 
A good example of the power of smFRET to 
explore heterogeneous mixtures and distinguish 
subpopulations of conformers can be taken from 
the field of bacterial transcription. Here, the molec- 
ular mechanism of the long-known but poorly 
understood abortive transcription initiation was 
deciphered by two concerted single-molecule experiments
(60, 61), one of which was based on
smFRET (60). Both studies showed that RNA 
polymerase (RNAP) repeatedly and unsuccess- 
fully attempts to reel the downstream DNA into 
its active site (using a mechanism called “DNA 
scrunching”) before clearing the promoter and 
proceeding to transcript elongation (Fig. 4A) (60).
By using distinct labeling schemes, the FRET study 
ruled out other proposed mechanisms (“inch- 
worming” and “transient excursion”; Fig. 4A). When 
the acceptor (A, red dot) labeled the promoter 
sequence and the donor (D, green dot) labeled 
the RNAP’s leading edge, no difference was ob- 
served between the FRET histograms of the 
RNAP-promoter open complex (RPo) and the
RNAP-promoter complex transcribing up to
seven bases (RPitc≤7; Fig. 4A, left). This excluded
the inchworming model. When the acceptor labeled 
the DNA upstream of the promoter sequence and 
the donor labeled the RNAP’s trailing edge, again 
no difference could be observed between the FRET 
histograms of RPo and RPitc≤7 (Fig. 4A, center). This
excluded the transient excursion model. In a third 
experiment, the acceptor and donor labeled the 
DNA downstream and upstream relative to the 
promoter sequence, respectively (Fig. 4A, right). 
In this case, smFRET showed an increase in the 
long-distance fraction (small apparent FRET 
efficiency, E*) upon addition of nucleotides permitting
transcription initiation. These data unambiguously
supported the scrunching mechanism,
where DNA is reeled into the active site by RNAP 
during the initial stages of transcription, resulting 
in an increase in the size of the transcription bubble. 
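
The kind of histogram comparison described above can be sketched in a few lines. This is an illustrative example under stated assumptions, not the analysis used in the cited study: it assumes the apparent efficiency E* of a burst is the uncorrected ratio of acceptor photons to total photons (ignoring background, spectral crosstalk, and detection-efficiency corrections), and the burst counts and the 0.5 threshold are made-up values.

```python
def apparent_efficiency(n_donor, n_acceptor):
    """Uncorrected apparent FRET efficiency E* of one photon burst (assumed definition)."""
    total = n_donor + n_acceptor
    if total == 0:
        raise ValueError("burst contains no photons")
    return n_acceptor / total


def histogram(values, n_bins=10):
    """Count values in equal-width bins covering the interval [0, 1]."""
    counts = [0] * n_bins
    for v in values:
        index = min(int(v * n_bins), n_bins - 1)  # put v == 1.0 in the last bin
        counts[index] += 1
    return counts


def low_fret_fraction(values, threshold=0.5):
    """Fraction of bursts below the threshold, i.e. the long-distance subpopulation."""
    return sum(v < threshold for v in values) / len(values)


# Hypothetical (donor, acceptor) photon counts for five bursts.
bursts = [(40, 10), (35, 15), (12, 38), (10, 40), (45, 5)]
e_values = [apparent_efficiency(d, a) for d, a in bursts]

print("E* per burst:", e_values)
print("histogram:", histogram(e_values))
print("long-distance fraction:", low_fret_fraction(e_values))
```

Comparing `low_fret_fraction` before and after adding nucleotides would be one simple way to quantify a shift toward the long-distance population, under the assumptions above.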

Another good example of smFRET’s ability to 
disentangle conformational subpopulations is the 
study of the enzyme adenylate kinase (62, 63). 
Previous ensemble time-resolved FRET measure- 
ments had suggested a single conformational state 
characterized by a broad distance distribution for 
the enzyme in the absence of its substrates, aden- 
osine monophosphate and Mg-adenosine tri- 
phosphate (64). smFRET measurements showed, 
however, that at least two distinct dynamically 
interconverting conformations were present in 
the absence of substrates: an apo conformation 
and an active-like conformation (Fig. 4B) (62, 63). 
These and similar studies (47, 48) have shed light 
on how enzymes exist in different precatalytic 
conformations and how substrate binding can
stabilize one of these conformations. 

Another fruitful area of smFRET investigations 
is protein folding. The function of a protein is 
encoded in its three-dimensional (3D) structure. 
Although deduction of a protein’s tertiary struc- 
ture (its native conformation) from its primary 
sequence has been revolutionized by computa- 
tional techniques (65), smFRET experiments have 
provided many additional insights into the pro- 
cess of folding—whether into the correct structure 
or into incorrect structures (“misfolding”)—and 
characterization of possible folding intermedi- 
ates (66, 67). Measurements involving a variety 
of denaturing agents have yielded evidence of a 
monotonic shift in the mean FRET value of the 
unfolded subpopulation as a function of dena- 
turant concentration. These observations have 
been interpreted as a manifestation of rapid in- 
terconversion between the unfolded state and 
folding intermediates (29, 68), which would imply 
the existence of folding intermediates stabilized 
by non-native contacts (52, 53). Such studies have 
been expanded by many groups, and fast micro- 
fluidic mixers have enabled the extension of 
research from equilibrium to nonequilibrium 
regimes (69). The relevance of these in vitro fold- 
ing studies to in vivo chaperone-assisted protein 
folding or to cotranslational protein folding is a 
topic of current investigation (70). 

A related area of investigation to benefit from 
smFRET is the conformation of intrinsically dis- 
ordered proteins (IDPs). IDPs are often stabilized 
in a folded state upon ligand binding (i.e., co-folding).
For example, α-synuclein (αSyn), a major
determinant in Parkinson’s disease, is an IDP that
co-folds upon binding to membranes. Deniz and
co-workers studied the conformational changes
of αSyn upon co-folding with different ligands
and characterized its associated rapid conformational
dynamics (71). They found that αSyn gains
different α-helical structures after binding to
lipid-mimetic agents with varying surface curvature.




Fig. 1. The concept of FRET. (A and B) An electromagnetic transmitter-receiver (A) is a 
macroscopic analog for the molecular dipole-dipole coulombic interaction between donor and 
acceptor (D and A) fluorophores (B). The dependence of the efficiency of energy transfer from D to A 
on their distance provides a molecular ruler with a high dynamic range on the 3- to 9-nm scale. 
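
The distance dependence behind the "molecular ruler" can be made concrete with the standard Förster relation, E = 1 / (1 + (r/R0)^6), where R0 is the distance at which the transfer efficiency is 50%. The sketch below is illustrative only: the function names are hypothetical and the default R0 of 6 nm is an assumed value, since R0 depends on the specific dye pair.

```python
def fret_efficiency(r_nm, r0_nm=6.0):
    """Förster transfer efficiency at donor-acceptor distance r_nm.

    r0_nm is the Förster radius (distance of 50% efficiency); 6 nm is an
    assumed, illustrative value.
    """
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)


def distance_from_efficiency(efficiency, r0_nm=6.0):
    """Invert the Förster relation to recover the distance from an efficiency."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return r0_nm * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


# The sixth-power dependence makes E most sensitive to distance near R0.
for r in (3.0, 4.5, 6.0, 7.5, 9.0):
    print(f"r = {r:.1f} nm  ->  E = {fret_efficiency(r):.3f}")
```

With the assumed R0, the efficiency falls from nearly 1 at 3 nm to below 0.1 at 9 nm, which is why the usable range of the ruler is limited to a few nanometers on either side of R0.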
Similarly, smFRET helped to elucidate the conformational
dynamics, folding mechanisms, and
function of RNA molecules (54-59). Not all genes
code for proteins; some produce noncoding RNAs. These RNAs become functional
upon folding into specific structures. Many such 
RNA molecules serve as ribozymes (RNA-based 
enzymes) or regulate various cellular processes 
such as gene expression and ribosome translation 
(54). The complexity of folding scales with the 
structural complexity of the RNA. The folding of 
RNA molecules goes through multiple free energy 
local minima separated by barriers of various 
heights (72). Therefore, it is easy for ribozymes 
to become trapped in long-lived, nonfunctional 
states (73). Even hairpin ribozymes, previously 
presumed to be “simple,” exhibit multiple inter- 
mediates and multiple pathways during folding 
(74, 75). 

These examples illustrate how smFRET can 
be used to study the conformational dynamics 
and function of biological macromolecules. Next, 
we consider different kinds of smFRET measure- 
ments, the type of data and analyses associated 
with them, and examples of the dynamics these 
methods are capable of exploring. 


Conformational states and 
their dynamics 


smFRET can be used to characterize distinct 
conformational states in macromolecules and 
the dynamics of their interconversion. However, 
transitions between states can only be measured 
if they occur over a time scale comparable to the 
technique’s temporal resolution. Transition time
scales increase steeply (approximately exponentially)
with the height of the activation barrier between
states (Fig. 5A). Separation
by a low barrier means rapid interconversion
between states, which results in averaged-out 
smFRET data and indistinguishable states. Tran- 
sitions occurring over time scales much longer 
than the typical observation time will, of course, 
not be detected either. The temporal resolution 
of a smFRET experiment depends on several pa- 
rameters. One of the most important is whether 
the experiment involves freely diffusing molecules 
(Figs. 2D and 5B, left) or immobilized molecules 
(Figs. 2E and 5B, right). In both cases, FRET effi- 
ciency is calculated for each individual molecule 
over short, finite time intervals. For freely diffus- 
ing molecules, this observation period is set by 
the transit time of the diffusing molecule through 
the observation volume. These rare events gen- 
erate a “burst” of fluorescence photons with a 
typical duration on the order of 1 ms. A given 
molecule may or may not be detected again sub- 
sequently, depending on its random diffusion 
path. For immobilized molecules, observation 
can last for several seconds or even minutes, gen- 
erating time traces of fluorescence (or FRET effi- 
ciency, once processed) with a temporal resolution 
set by a combination of detector readout rate and 
signal level (a few milliseconds at best; Fig. 5B, 
right). Although improved organic dyes have been 
developed (76, 77), dye photobleaching remains the 
main constraint on the maximal observation time 
and temporal resolution (78). Analysis of burst 
(freely diffusing) or time-binned (immobilized) 

Fig. 2. Principle and use of FRET for elucidating biomolecular reaction
mechanisms and structural dynamics. (A to C) Principle of intramolecular
(A) and intermolecular (B) FRET assays and their readout (C)
in single-molecule and bulk fluorescence (Fl.) experiments. The bulk
experiments always show an average value [i.e., donor (D) and acceptor (A) 
intensity of, e.g., hypothetical 50/50], whereas smFRET can determine 
(dynamically interconverting) states directly. The crystal structure overlay 
of substrate-binding domains of an ABC transporter in (A) shows open (red) 


data allows identification of distinct conforma- 
tional subpopulations, their FRET efficiency, and, 
in favorable cases, their interconversion rates. 
Studying slow conformational dynamics (from 
0.1 to 10 s) requires long observation times and is 
mostly done with immobilized molecules (Figs. 
2E and 3C). Here, the different durations (dwell 
times) spent by a molecule in each state are an- 
alyzed, and energy transfer efficiencies are either 
directly extracted or obtained via hidden Markov 
modeling or Bayesian statistical analyses (79, 80) 
(Fig. 5B, right). Results from many individual 
molecules observed in parallel are pooled to ob- 
tain statistically meaningful information. This ap- 
proach has been used to study the dynamics of 
nucleic acid-processing enzymes such as helicases 
(40), the complex molecular mechanism of translocation
of the ribosome (81), and HIV reverse
transcriptase initiation (82), among many others. 
Extraction of these dynamical parameters would 
be very difficult using ensemble techniques. 
Faster conformational dynamics (10 μs to 0.1 s)
are typically best studied with diffusion-based 
smFRET (Figs. 2D and 3A). Here, dynamics can 
be extracted from fluctuations in FRET efficiency 
within single-molecule bursts or between con- 
secutive bursts of the same molecules (moving in 
and out of the observation volume several times) 
with accessible time scales in the range of 0.1 to 
10 ms. If diffusing molecules change conforma- 
tion during transit through the observation vol- 
ume, the time-averaged FRET efficiency within 
each burst is of little use, although it could hint at 
the presence of faster dynamics (83, 84). Analytical
methods to investigate such dynamics have been
developed in recent years (85, 86). For instance,
Torella et al. examined the short time scale variance
of FRET efficiency within individual single-molecule
bursts [burst variance analysis (BVA)]
(86) (Fig. 5C, left). FRET variance exceeding that 
expected from photon-counting statistics (“shot 
noise”) was used to detect millisecond-time scale 
dynamics in complexes of the Klenow fragment 
of DNA polymerase. Using the same approach, 
Robb et al. showed that in transcription initiation, 
the transcription bubble (the DNA region opened 
up by RNAP) exhibits conformational dynamics 
on the submillisecond time scale (Fig. 5C, left) (87). 
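
The idea behind burst variance analysis can be illustrated with a minimal Python sketch. It is not the published implementation: it assumes an uncorrected proximity ratio (no background, leakage, or γ correction), a fixed window of n photons, and a toy photon stream. The names `burst_variance` and `simulate_burst` are hypothetical. A burst is cut into consecutive windows of n photons, the standard deviation of the per-window FRET values is computed, and this is compared with the binomial shot-noise expectation sqrt(E(1 - E)/n).

```python
import math
import random
from typing import List, Tuple


def burst_variance(acceptor_flags: List[int], n: int = 5) -> Tuple[float, float, float]:
    """Return (mean E*, observed SD, shot-noise SD) for one burst.

    acceptor_flags: photon stream of the burst in arrival order,
                    1 = acceptor-channel photon, 0 = donor-channel photon.
    n: number of consecutive photons per sub-burst window (assumed value).
    """
    n_windows = len(acceptor_flags) // n
    if n_windows < 2:
        raise ValueError("burst too short for the chosen window size")
    # Uncorrected proximity ratio E* in each consecutive window of n photons.
    e_windows = [sum(acceptor_flags[i * n:(i + 1) * n]) / n for i in range(n_windows)]
    e_mean = sum(e_windows) / n_windows
    # Observed standard deviation of the window values within the burst.
    observed_sd = math.sqrt(sum((e - e_mean) ** 2 for e in e_windows) / n_windows)
    # Binomial (shot-noise-limited) SD expected for a static species.
    shot_noise_sd = math.sqrt(e_mean * (1.0 - e_mean) / n)
    return e_mean, observed_sd, shot_noise_sd


def simulate_burst(e_values: List[float], photons_per_state: int,
                   rng: random.Random) -> List[int]:
    """Toy photon stream: the molecule visits each E in e_values in turn."""
    return [1 if rng.random() < e else 0
            for e in e_values for _ in range(photons_per_state)]


if __name__ == "__main__":
    rng = random.Random(0)
    static = simulate_burst([0.5], 200, rng)        # one state throughout
    dynamic = simulate_burst([0.2, 0.8], 100, rng)  # switches state mid-burst
    for label, burst in (("static", static), ("dynamic", dynamic)):
        e, sd, sn = burst_variance(burst)
        print(f"{label}: E* = {e:.2f}, observed SD = {sd:.3f}, shot-noise SD = {sn:.3f}")
```

A burst whose observed SD clearly exceeds the shot-noise SD is a candidate for within-burst dynamics. In practice the comparison is made over many bursts and with a confidence interval rather than a single threshold.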

Single molecules freely diffusing in 3D may 
reenter the observation volume several times 
before diffusing away permanently. This results 
in a series of consecutive single-molecule bursts, 
between which the molecule may change its con- 
formation. This opens the possibility of analyzing 
conformational dynamics by recurrence analysis 
of single particles (RASP; Fig. 5C, center) (88). 
Analyzing the succession of FRET efficiencies of 
consecutive bursts separated by variable recur- 
rence times enabled quantification of the fold- 
ing relaxation times of small proteins such as cold 
shock protein (Csp), spectrin R15, and the B do- 
main of protein A (BdpA), revealing time scales 
of 250 ms, 32 ms, and 0.7 ms, respectively (88). 
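
The recurrence idea can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: bursts are given as (time, E) pairs sorted by time, and every burst detected within the recurrence window is treated as coming from the same molecule, whereas a full analysis must estimate that same-molecule probability. The function name `recurrence_fraction` and its parameters are hypothetical.

```python
import math
from typing import List, Tuple


def recurrence_fraction(bursts: List[Tuple[float, float]],
                        e_start: Tuple[float, float],
                        e_end: Tuple[float, float],
                        t_min: float,
                        t_max: float) -> float:
    """Fraction of recurring bursts that land in the E range e_end.

    bursts: (arrival time in s, FRET efficiency) pairs, sorted by time.
    e_start: (low, high) E range that the initial burst must fall in.
    e_end: (low, high) E range tested for the later burst.
    t_min, t_max: recurrence-time window in seconds.
    """
    hits = 0
    total = 0
    for i, (t1, e1) in enumerate(bursts):
        if not e_start[0] <= e1 <= e_start[1]:
            continue
        for t2, e2 in bursts[i + 1:]:
            lag = t2 - t1
            if lag > t_max:
                break  # bursts are time-sorted, so later ones are too late
            if lag >= t_min:
                total += 1
                if e_end[0] <= e2 <= e_end[1]:
                    hits += 1
    return hits / total if total else math.nan


if __name__ == "__main__":
    toy = [(0.0, 0.2), (0.001, 0.25), (0.002, 0.8), (1.0, 0.2), (1.0005, 0.8)]
    frac = recurrence_fraction(toy, (0.1, 0.3), (0.7, 0.9), 0.0, 0.005)
    print(f"fraction switching low -> high within 5 ms: {frac:.2f}")
```

Repeating the calculation for a series of recurrence windows gives the fraction as a function of lag time, from which a relaxation time can be fitted.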

Faster conformational changes (<0.1 ms) yield 
single-molecule bursts with averaged-out FRET 
values. Approaches that do not rely on the anal- 
and closed (green) conformations. (D and E) smFRET with diffusing molecules
(D) or immobilized molecules (E) including accessible biophysical parameters 
(i.e., conformational states and dynamical changes). For characterization of 
conformational states, histograms of FRET efficiency E with frequency n 

are used; dynamics are directly seen via temporal evolution of E obtained 
via ratio of acceptor (A) fluorescence to fluorescence from both donor (D) and 
acceptor after donor excitation. [(A) and (D) adapted, with permission, 


ysis of separate single-molecule bursts, but rather 
on photon statistics within bursts, are therefore 
called for. In addition, because such rapid confor- 
mational changes include multiple transitions 
within each single-molecule burst, the variance 
of the FRET efficiency becomes noisy (in a way 
that resembles shot noise). In this limit, tech- 
niques that resolve FRET dynamics through 
variance analysis (such as BVA) cannot resolve 
faster FRET dynamics. For this regime, fluores- 
cence correlation spectroscopy (FCS) methods 
applied to smFRET (FRET-FCS) are the most 
straightforward to implement, even if demand- 
ing in terms of statistics (Fig. 5C, right). For 
instance, Nettels et al. performed diffusion-based 
smFRET measurements on Csp, acquiring data 
in order to compute correlation curves down to 
the picosecond time scale. Using this approach, 
they showed that the unfolded state of Csp 
undergoes structural reconfiguration within 
~40 ns (89). Fluorescence lifetime analysis
(using pulsed laser excitation) provides additional
information that has also been used to unravel
rapid FRET dynamics.
Wozniak et al. used time-correlated single photon 
counting (TCSPC) to explore the bending dynamics
of short dsDNA (90). Dolino et al. observed
submillisecond dynamics in the ligand-binding
domain of the N-methyl-D-aspartate receptor (91).
Using alternating laser excitation on the nano- 
second time scale (nsALEX; see Box 1), Laurence et al. 
analyzed fluorescence decays of specific FRET 

Fig. 3. Pioneering implementations of smFRET. (A) Schematic

of a confocal microscope setup used for the acquisition of 
diffusion-based smFRET data; F(D) and F(A) indicate the donor 

and acceptor detection channels, respectively. (B) Example of data 
obtained with such a setup. The different histograms show the FRET 
efficiency distributions obtained for DNA samples differing by the 
distance between donor and acceptor labels; bp, base pairs. [Adapted, 


subpopulations to infer an effective distance distribution
for the folded and unfolded chymotrypsin
inhibitor 2 (CI2) (92).

Although powerful, these fast conformational 
dynamics methods usually do not provide infor- 
mation on the exact number of conformational 
states (and their mean FRET efficiencies) in- 
volved in the identified dynamics. Recently, 
Pirchi et al. reported an analytical method to 
extract the values of these parameters by perform- 
ing a photon-by-photon hidden Markov model- 
ing analysis of smFRET experiments (H2MM) 
(93), as previously suggested by Gopich and 
Szabo (94). They were able to extract rate constants
(for dynamics on time scales ranging from ~10 μs to ~1 s) and the mean
FRET efficiencies of the corresponding states. 
We are therefore on a path toward full charac- 
terization of fast conformational dynamics of 
macromolecules: the number of states, their FRET 
values, and the interconversion rate constants. 

All of the methods mentioned above, although 
powerful, rely on a single reaction coordinate (the 
distance between a single donor-acceptor pair) 
and therefore provide a limited perspective on 
the underlying dynamics. We next discuss how 
multiple reaction coordinates can be simultaneously
measured to untangle complex conformational
dynamics.


Toward multiple reaction coordinates 


A single smFRET measurement reports on a 
single distance within a macromolecular struc- 
ture, projecting a complex 3D structure onto a 
single 1D reaction coordinate (Fig. 5A). In some 
macromolecules, domains or subunits may be 
approximated as rigid bodies linked by flexible 
linkers or interacting through well-defined bind- 
ing interfaces. In other cases, allosteric ligand 
binding to one part of a macromolecule can 
cause conformational changes in other parts 
of the same macromolecule. In these cases, a 
single reaction coordinate may not be enough 
to report coordinated motions. Additionally, re- 
gardless of the presence or absence of allosteric 
binding and coordinated motion, some smFRET- 
derived single distances may be insensitive to 
conformational changes (e.g., structural change 
occurs tangential to the monitored distance or 
occurs in another part of the macromolecule). 
For all these reasons, it is generally desirable to 
study conformational changes with more than 
one set of positions for a pair of dyes. 

with permission, from (28)] (C) Schematic of a total internal

reflection fluorescence (TIRF) setup allowing the study of smFRET on 
surface-immobilized molecules. (D) Example of data obtained with 
such a setup, showing the real-time dynamics of RNA catalysis and 
folding. FRET trajectories were retrieved for individual RNA molecules 
(right) and histograms of dwell times reported on the time scale of the 
dynamics (lower left). [Adapted from (6)] 


An obvious solution is to label the macro- 
molecule with more than two dyes. Multicolor 
smFRET techniques (95) can indeed provide a 
wealth of information (Fig. 6A). Ha and co- 
workers (96) and Person et al. (97) used three- 
color smFRET to study Holliday junctions, which 
spontaneously switch between two distinct con- 
formations. They simultaneously determined 
three distances unambiguously and specified the 
correlated movement of the junction’s hairpin 
structure. Multicolor smFRET techniques provide 
high information content but are challenging to 
implement. They require multiple orthogonal and 
efficient site-specific labeling chemistries, elabo- 
rate optics, and data analysis techniques. Some of 
these difficulties can be mitigated by using a dark 
quencher as one of the (three) dyes. Because the 
dark quencher accepts excitation energy through 
FRET but does not emit photons, there is no need 
for detection of emission, thereby simplifying 
data collection. Kapanidis and co-workers (98) 
used this approach to monitor the binding and 
unbinding of a DNA polymerase to its substrate 
in real time, without the need for three-color 
detection for simultaneous detection of protein 
binding and associated conformational changes. 
It is, however, also possible to monitor more
than one distance using multiple identical dyes 
in a two-color excitation and detection scheme, 
using some photophysical tricks. In biomolecular 
complexes with more than one donor and accep- 
tor of the same kind, fluorophore interactions 
via FRET are highly complex, and the relation 
of FRET efficiency E to inter-dye distances R is 
generally nontrivial because of multiple ener- 
gy transfer pathways. For immobilized molecules, 
this multiplicity can be removed by using chem- 
ically induced stochastic blinking of the acceptor 
fluorophores (99, 100), leaving only one active 
acceptor per molecule for a brief period of time. 
Using this “photoswitchable FRET” approach, 
Uphoff et al. measured the distances between 
DNA and two residues on the catabolite activator 
protein (CAP), as well as the strand exchange
dynamics in Holliday junctions (100). Because the
acceptor blinks randomly, smFRET time traces 
exhibit different values over time, allowing mea- 
surement of multiple distances from a single 
donor to multiple acceptor fluorophores (Fig. 6B). 

Another alternative to multicolor smFRET— 
which requires as many detection channels as 
there are different dyes—is using simple two-color 
smFRET with other independent observables 
(translational diffusion, fluorescence anisotropy, 
brightness, etc.) (JO2). For example, by using an 
alternating laser excitation (ALEX) scheme (see 
Box 1), Kapanidis et al. were able to simultaneously 
report FRET values E for each molecule as well as 
the “stoichiometry” S, defined as the ratio between 
fluorescence originating from donor excitation 


Fig. 4. Typical examples of smFRET studies. (A) Transcription
initiation involves a DNA scrunching mechanism. The results of three 
experiments differing by the location of the donor and acceptor 

dyes are shown (see text). The cartoons indicate which model is or is 
not compatible with the results. [Adapted from (60)] (B) Intrinsic 
domain motions between conformations in adenylate kinase (AK). The 
experiment tracks the distance between substrate-binding domains 

and that originating from both donor and acceptor
excitations (102). Changes in mean S values
report changes in the ratio of donor and acceptor 
brightnesses. ALEX can therefore distinguish 
molecules with different numbers of donor and 
acceptor dyes, or molecules with altered dye fluo- 
rescence quantum yields. In a recent implemen- 
tation of ALEX to simultaneously report two 
distances, smFRET was combined with protein- 
induced fluorescence enhancement (PIFE) (103, 104). 
One distance was between the dyes and within 
the FRET distance range (~3 to 9 nm); the other, 
between one dye and a bound protein, was in 
the shorter PIFE distance range (<3 nm). PIFE- 
FRET thus provided direct evidence for molecular 
coordination in the open transcription bubble 
(Fig. 6C) (104). Similarly, smFRET was combined 




(donor and acceptor dyes as green and red stars, respectively) in the 
AK enzyme in apo form (left histogram) and when bound to the 
substrate-mimicking inhibitor Ap5A (right histogram). FRET efficiency
histograms (left) and single-molecule time traces (right) show that in 
apo conformation, AK dynamically switches between two conformations, 
one of which is similar to the substrate-bound state. [Adapted, with 

Fig. 5. Biomolecular dynamics accessible by smFRET. (A) Hypothetical
energy landscape with Gibbs free energy projected onto a single reaction 
coordinate r showing different local minima (states) separated by 

energy barriers of different heights, giving rise to conformational 
transitions over different time scales. (B) smFRET data from diffusing 
molecules (bursts, left) and immobilized molecules (time traces, right) 
can be analyzed by various methods with differing temporal resolutions to 
study conformational transitions over different time scales. Conformational 
dynamics slower than ~0.1 s can be studied by analysis of single-molecule 


traces and dwell times in each FRET-associated state. (C) Examples of 
data analysis techniques using details of burst properties and photon 
statistics: Burst variance analysis (BVA) identifies bursts with variance 
of the FRET efficiency larger than expected from shot noise; recurrence 
analysis (RASP) identifies whether the FRET efficiency has changed 
between consecutive bursts of the same molecule; and correlation 
techniques identify time scales (including <100 μs) at which fluorescence-related
processes occur, including changes in FRET efficiency. [Reproduced,
with permission, from (87, 88)] 


with photo-induced electron transfer (PET) (105), 
where the donor dye was quenched by a nearby 
tryptophan moiety. However, the steep depen- 
dence of PET on dye-quencher distance (ang- 
stroms) results in a binary output (contact/no 
contact) rather than a quantitative distance measurement.
Finally, because both multicolor smFRET
and ALEX achieve high information content,
combining the two techniques increases the information
content further (both FRET and brightness
ratio for each dye pair permutation). Lee et al. used
three-color ALEX to monitor the translocation of 
bacterial RNAP on DNA on two distinct reaction 
coordinates (106). Such high information content 
can be used for multiplexed sorting in molecular 
diagnostics. Yim et al. have shown the capability to 
sort and quantify multiple different biomarkers in 
a four-color ALEX experiment (107). 
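
The E and S quantities reported by two-color ALEX, as defined above, can be sketched as follows. This is a minimal illustration: it assumes background-corrected photon counts per burst, uses the uncorrected proximity ratio for E, and applies arbitrary illustrative thresholds on S. The names `alex_e_s` and `classify_by_s` and the threshold values are assumptions, not part of any specific package.

```python
import math
from typing import Tuple


def alex_e_s(f_dex_dem: float, f_dex_aem: float, f_aex_aem: float) -> Tuple[float, float]:
    """Return (E*, S) for one burst.

    f_dex_dem: donor emission during donor excitation.
    f_dex_aem: acceptor emission during donor excitation.
    f_aex_aem: acceptor emission during acceptor excitation.
    """
    donor_exc = f_dex_dem + f_dex_aem  # all photons after donor excitation
    total = donor_exc + f_aex_aem      # photons after either excitation
    if total <= 0:
        raise ValueError("burst contains no photons")
    s = donor_exc / total
    # E* is undefined when nothing was detected after donor excitation.
    e_star = f_dex_aem / donor_exc if donor_exc > 0 else math.nan
    return e_star, s


def classify_by_s(s: float, low: float = 0.2, high: float = 0.8) -> str:
    """Sort a burst by stoichiometry; thresholds are illustrative only."""
    if s > high:
        return "donor-only"
    if s < low:
        return "acceptor-only"
    return "donor-acceptor"


if __name__ == "__main__":
    for counts in ((30, 30, 60), (50, 0, 0), (0, 0, 40)):
        e_star, s = alex_e_s(*counts)
        print(counts, f"E* = {e_star:.2f}", f"S = {s:.2f}", classify_by_s(s))
```

Plotting E* against S for all bursts separates donor-only and acceptor-only species from doubly labeled molecules before any FRET histogram is interpreted.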

In each of these techniques, macromolecule 
labeling with fluorophores (or quenchers) is impor- 
tant. Although high-purity site-specific dye-labeled 
nucleic acids are now commercially available, 
preparation of site-specifically labeled proteins 
at multiple residues is far more challenging. 
Advances in this field [reviewed in (108, 109)] 
will allow the study of multiple reaction coordinates 
and coordinated motions in single subunit proteins.
As an alternative [already suggested in 1999
(110)], smFRET can be combined with other
single-molecule techniques not involving fluorescence.
These include patch-clamp (111) and single-molecule
manipulation methods such as optical (112) and
magnetic (113) tweezers, atomic force microscopy
(114), and microfluidics and drag forces (115). These
hybrid approaches are very powerful and simul- 
taneously measure multiple orthogonal reaction 
coordinates. A detailed account of these methods 
is outside the scope of this review and can be 
found elsewhere (95). 


Solving 3D structures with smFRET 


If properly calibrated, FRET efficiency E carries 
information on the precise distance between do- 
nor and acceptor dyes. Can this information be ex- 
tracted and used for 3D structure determination? 
X-ray crystallography, NMR, and cryo-EM are 
currently the gold standards for obtaining atomic- 
resolution 3D structures of complex macromole- 
cules. However, crystallization conditions may 
preferentially stabilize one conformation over 
others (62); in extreme cases, crystallization may 
even induce a structure never observed in solution,
as detected by comparison with structures
solved by solution-based techniques (116). Thus, 
structural characterization of macromolecules in 
solution and at ambient temperature is desirable. 

The ability to identify distinct conformational 
subpopulations can help in structure determina- 
tion, because relevant subpopulations can be 
identified and selected for further processing. 
Structure determination requires the preparation 
and measurement of multiple donor-acceptor 
variants labeling different pairs of positions on 
the macromolecule. The nontrivial transformations
from uncorrected FRET efficiency E* to
corrected FRET efficiency E, and then to inter-dye
distance R and to inter-residue distance r, require
additional preparations and measurements
of control mutants, modeling and simulations, 
structural convergence procedures, and control 
and validation of refined structures. Several studies 
(117-119) have followed this route, reporting suc- 
cessful structure determination (120-122). The in- 
formation retrieved from smFRET measurements 
of multiple distances can be used to directly tri- 
angulate a structure of a whole or part of a mac- 
romolecule, or can be used as experimental 
constraints for structural simulations (121). In
the latter approach, each iteration produces a
structural snapshot. After assessing the dyes’ ac- 
cessible volumes via the “nanopositioning sys- 
tem” (NPS) approach (117), the computationally 
derived mean inter-dye distances are compared 
with the experimentally derived ones for all mea- 
sured constructs, and the sum of all deviations 
(cost function) is computed. This process, per- 
formed on a large library of simulated structural 
snapshots, should result in a subset of candidate 
conformations selected by minimization of the 
cost function. Using this approach, we recently 
identified two conformations of the transcription 
bubble in the bacterial RNAP-promoter open (RPo) 
complex (123). The set of distances of one con- 
formation agreed with the crystal structure of 
bacterial RPo (124), while the other did not. 
The latter conformation had characteristics 
of a scrunched transcription bubble, where a few 
bases from the duplex downstream to the bubble 
were reeled into the active site of RNAP and in- 
creased the size of the transcription bubble. 
Although successful structural determinations 
by smFRET have been reported, single-particle 
cryo-EM has also gained the ability to resolve 
several (up to three) conformational states in the 
frozen ensemble (4, 125, 126). Nonetheless, single- 
particle cryo-EM fundamentally lacks what smFRET 
readily provides: the ability to detect the dynamics 
of transitions between conformations. We anticipate 
that FRET-derived macromolecular structures or 
distance constraints will also be accepted in the 
future as entries in the Protein Data Bank. How- 
ever, different laboratories currently use different 
measurement and analysis techniques, different 
protocols, and different types of data files. There- 
fore, smFRET experiments—and more important, 
the control experiments and data analysis proce- 
dures required for obtaining exact distances— 
have to be standardized, as outlined below. 
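
The cost-function screening of simulated snapshots described earlier in this section can be sketched in Python. This is a toy illustration: the text speaks of a "sum of all deviations", and the sketch assumes a sum of squared, uncertainty-weighted deviations as one possible choice. The dye-pair labels, distances, uncertainties, and function names are hypothetical, and the accessible-volume modeling of the dyes is assumed to have been done beforehand.

```python
from typing import Dict, List, Tuple

# Measured mean inter-dye distance and uncertainty (nm) per labeled construct.
Measured = Dict[str, Tuple[float, float]]
# Modeled mean inter-dye distance (nm) per construct for one snapshot.
Snapshot = Dict[str, float]


def cost(model: Snapshot, measured: Measured) -> float:
    """Sum over dye pairs of ((R_model - R_measured) / sigma)^2."""
    return sum(((model[pair] - r_exp) / sigma) ** 2
               for pair, (r_exp, sigma) in measured.items())


def select_candidates(snapshots: Dict[str, Snapshot],
                      measured: Measured,
                      keep: int = 1) -> List[str]:
    """Names of the `keep` snapshots with the lowest cost, best first."""
    ranked = sorted(snapshots, key=lambda name: cost(snapshots[name], measured))
    return ranked[:keep]


if __name__ == "__main__":
    measured = {"pair_AB": (5.0, 0.3), "pair_AC": (7.0, 0.3)}
    snapshots = {
        "snapshot_1": {"pair_AB": 5.1, "pair_AC": 6.8},
        "snapshot_2": {"pair_AB": 3.5, "pair_AC": 7.0},
        "snapshot_3": {"pair_AB": 5.0, "pair_AC": 8.8},
    }
    for name in select_candidates(snapshots, measured, keep=3):
        print(name, round(cost(snapshots[name], measured), 2))
```

With a real snapshot library the same ranking would be run over many thousands of structures, and the retained subset would then be checked against control measurements.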


Standardizing smFRET measurements 


Because of the challenges of smFRET data cali- 
bration, it is important to strive for reproducibility 
across laboratories by establishing standard pro- 
tocols and data-sharing practices. Such a stan- 
dardization effort, led by the Hugel and Seidel 
groups, was recently initiated through the wwPDB 
Hybrid/Integrative Methods Task Force (127). 
Equally important to this effort is a standard set 
of recommended practices that could be verified 
by peer review. These include (i) avoiding sub- 
jective selection of data sets (e.g., time traces in 
surface-immobilized experiments), (ii) requiring 
the donor-acceptor fraction of the labeled sample 
to be larger than 10%, (iii) using different exci- 
tation powers to assess photophysics effects, (iv) 
requiring fluorescence anisotropy measurements 
to characterize fluorophore rotational freedom, 
and (v) comparison with ensemble assays (dena- 
turation curve, enzymatic assay, secondary struc- 
ture content, thermal stability, ligand binding 
affinity, etc.) of the labeled macromolecule with 
its unlabeled counterpart to verify its activity and 
the relevance of the smFRET measurement. 

Fig. 6. smFRET-based approaches to study molecular coordination. (A) Multicolor smFRET
studying coordinated movement of a Holliday junction via proximity ratio PR: donor-transmitter
D-T (green trace), transmitter-acceptor T-A (black), and donor-acceptor D-A (red). [Adapted, with
permission, from (97)] (B) Photoswitchable FRET relies on temporal separation of donor-acceptor
interactions via photoswitching and isolation of molecular species with one distinct donor-acceptor
pair at any given time point. [Adapted, with permission, from (100)] (C) PIFE-FRET uses a standard
two-color assay with donor and acceptor (D-A) but adds information on protein binding via use of an
environmentally sensitive donor (Cy3; Cy3B is used as the control dye that is insensitive to changes
in the environment). [Adapted, with permission, from (104)]

We also recommend that every smFRET experiment,
including experiments with surface-immobilized
molecules, should begin with a
“control” solution-based smFRET assay, so as to
determine (i) the quality of labeling, (ii) the number
of states or biochemical species resolved as
distinct FRET subpopulations in the sample, (iii)
the mean FRET efficiencies of the resolved
subpopulations, and (iv) interconversion rate constants
between subpopulations. With this information,
analysis of smFRET time trajectories from
surface-immobilized molecules can be guided by, and
compared to, a statistically robust diffusion-based
analysis. Moreover, this two-step process will allow
assessing whether biomolecule-surface interactions
are present and perturb the system under
study, in particular measured FRET efficiencies,
population frequencies, and time constants.
Finally, to improve cross-checking and repro- 
ducibility, standardized data analysis protocols 
should be used and preferably based on open- 
source software. For instance, the FRETBursts 
open-source package provides a starting point 
for diffusion-based analysis (128), and similar 
packages are available for surface-immobilized 
smFRET (129, 130). The use of standardized file 
formats such as Photon-HDF5 (131) is a prerequi- 
site to making the raw data freely available and 

Box 1. FRET as a spectroscopic ruler for macromolecular distances.


FRET is a “spectroscopic ruler” with a distance-dependent efficiency E given by Förster theory
(151) (Fig. 1). If R is the distance between two point dipoles [representing the center of the donor
(D) and acceptor (A) fluorophores; Fig. 1A], E depends on the sixth power of the distance (Fig. 1B),
assuming a “frozen” molecule (Eq. 1):


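E = \frac{1}{1 + (R/R_0)^6}    (1)


R_0^6 = \frac{9 \ln(10)\, \kappa^2 \phi_D}{128 \pi^5 N_A n^4} \int_0^\infty f_D(\lambda)\, \varepsilon_A(\lambda)\, \lambda^4 \, d\lambda    (2)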


The Förster radius, R_0, is the R at which E = 50%. R_0 depends on parameters indicated in Eq. 2:
N_A is Avogadro's number, n is the refractive index in the medium between the donor and acceptor, φ_D
is the donor fluorescence quantum yield in the absence of acceptor, f_D is the donor emission
spectrum with its area normalized to 1, ε_A is the spectrum of molar extinction coefficient of the
acceptor, and κ² is the orientation factor of the dyes. The range of distances that can be accurately
measured with FRET is 0.5R_0 to 1.5R_0 (for commonly used smFRET dye pairs, this translates into a
dynamic range of ~3 to 9 nm). The parameters in Eq. 2 (R, φ_D, κ², n) may dynamically change and
therefore complicate the interpretation of smFRET distance measurements. Careful control experiments
are therefore required. Note that E can be transformed into R if the dyes can be approximated
by point dipoles. This approximation holds if the dye sizes are much smaller than R. Although
smFRET has been demonstrated using quantum dots (152), they are too large to be approximated
by point dipoles. Similarly, genetically encoded fluorescent proteins that are frequently used to
monitor binding events and conformational changes in vivo (153) have chromophore groups that are
bound deep inside their cores, complicating the transformation of E to R. Small and bright organic
fluorophores are therefore the emitters of choice for smFRET measurements.
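
As a minimal numerical sketch of Eq. 1, the following Python snippet converts between distance and FRET efficiency under the point-dipole, static-distance approximation. The function names and the R_0 value of 6 nm are illustrative assumptions, not values taken from the text.

```python
def fret_efficiency(r_nm: float, r0_nm: float) -> float:
    """Eq. 1: E = 1 / (1 + (R/R0)^6) for a static donor-acceptor distance."""
    if r_nm < 0 or r0_nm <= 0:
        raise ValueError("R must be non-negative and R0 positive")
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)


def distance_from_efficiency(e: float, r0_nm: float) -> float:
    """Inverse of Eq. 1: R = R0 * (1/E - 1)^(1/6), valid for 0 < E < 1."""
    if not 0.0 < e < 1.0:
        raise ValueError("E must lie strictly between 0 and 1")
    return r0_nm * (1.0 / e - 1.0) ** (1.0 / 6.0)


if __name__ == "__main__":
    R0 = 6.0  # nm; assumed Forster radius, for illustration only
    for r in (0.5 * R0, R0, 1.5 * R0):
        print(f"R = {r:.1f} nm -> E = {fret_efficiency(r, R0):.3f}")
```

At 0.5R_0 and 1.5R_0 the efficiency is already about 0.98 and 0.08, respectively, so small errors in E translate into large errors in R outside this window.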

The average E can be measured experimentally using several approaches. The most straightforward
way uses the donor mean fluorescence lifetimes (Eq. 3):

E = 1 - \frac{\langle \tau_{DA} \rangle}{\langle \tau_{D0} \rangle}    (3)

where ⟨τ_DA⟩ and ⟨τ_D0⟩ are the donor mean fluorescence lifetimes in the presence or absence of
an acceptor, respectively. Knowing the value of R_0 of the dye pair, it is possible to deduce the mean
distance between dyes using Eq. 1. ⟨τ_DA⟩ and ⟨τ_D0⟩ can be retrieved in a single measurement via
nanosecond alternating laser excitation (nsALEX) (92) [also known as pulsed-interleaved excitation
(PIE) (103)], in which donor and acceptor are alternately excited with pulsed lasers and fluorescence
photons are collected using time-correlated single photon counting (TCSPC).

A simpler approach extracts E from the donor and acceptor fluorescence intensities recorded
using continuous-wave donor excitation, or with alternated donor and acceptor excitations
[microsecond ALEX (103)] (Fig. 2, D and E; Eqs. 4 and 5):

E = \frac{F_{FRET}}{\gamma F_{Dex}^{Dem} + F_{FRET}}    (4)

F_{FRET} = F_{Dex}^{Aem} - lk \cdot F_{Dex}^{Dem} - dir \cdot F_{Aex}^{Aem}    (5)

where F_{Dex}^{Dem} and F_{Dex}^{Aem} are the background-corrected fluorescence intensities of the donor and the
acceptor, respectively, measured during donor excitation; in the case of ALEX, F_{Aex}^{Aem} is the
background-corrected acceptor fluorescence intensity during acceptor excitation, lk is the donor fluorescence
leakage into the acceptor detection channel, dir is the acceptor fluorescence when directly excited
by the donor excitation laser, and γ is the ratio between acceptor and donor fluorescence quantum
yields and detection efficiencies.
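
The lifetime-based and intensity-based routes to E (Eqs. 3 to 5) can be sketched as below. The sketch assumes the commonly used forms E = 1 - ⟨τ_DA⟩/⟨τ_D0⟩, F_FRET = F_Dex^Aem - lk·F_Dex^Dem - dir·F_Aex^Aem, and E = F_FRET/(γ·F_Dex^Dem + F_FRET), with lk and dir treated as dimensionless correction coefficients. Function names and the numbers in the example are illustrative only.

```python
def efficiency_from_lifetimes(tau_da: float, tau_d0: float) -> float:
    """Lifetime route: E = 1 - <tau_DA> / <tau_D0>."""
    if tau_d0 <= 0:
        raise ValueError("donor-only lifetime must be positive")
    return 1.0 - tau_da / tau_d0


def fret_signal(f_dex_dem: float, f_dex_aem: float, f_aex_aem: float,
                lk: float, dir_: float) -> float:
    """Acceptor signal after donor excitation, corrected for donor leakage
    (lk * F_Dex^Dem) and direct acceptor excitation (dir * F_Aex^Aem)."""
    return f_dex_aem - lk * f_dex_dem - dir_ * f_aex_aem


def efficiency_from_intensities(f_dex_dem: float, f_dex_aem: float,
                                f_aex_aem: float = 0.0, lk: float = 0.0,
                                dir_: float = 0.0, gamma: float = 1.0) -> float:
    """Intensity route: E = F_FRET / (gamma * F_Dex^Dem + F_FRET)."""
    f_fret = fret_signal(f_dex_dem, f_dex_aem, f_aex_aem, lk, dir_)
    denominator = gamma * f_dex_dem + f_fret
    if denominator <= 0:
        raise ValueError("corrected intensities give a non-positive denominator")
    return f_fret / denominator


if __name__ == "__main__":
    print(efficiency_from_lifetimes(tau_da=2.0, tau_d0=4.0))
    print(efficiency_from_intensities(60.0, 50.0, 40.0, lk=0.1, dir_=0.05))
```

With all corrections left at their defaults (lk = 0, dir = 0, γ = 1), the intensity route reduces to the uncorrected proximity ratio, which is why calibration measurements are needed before E is converted to a distance.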

In ensemble FRET, the measured $\langle E \rangle$ reports on all molecules in all conformations; by contrast, in
smFRET, $\langle E \rangle$ values (diffusing or immobilized formats, Fig. 5B) represent time-averaged FRET
values over limited duration and/or limited number of molecules or events. During a single-
molecule burst or time trace, the molecule might not visit all the states that define the system.
The average of all $\langle E \rangle$ values for many single molecules and over a long enough observation will
equal the ensemble-averaged $\langle E \rangle$. smFRET can distinguish between distinct subpopulations of
$\langle E \rangle$ values, and each subpopulation may represent a distinct conformational state. However, if
interconversion between conformational states takes place on time scales faster than the
method's temporal resolution, $\langle E \rangle$ subpopulations may only represent time averages of these
interconverting states.
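
To illustrate the last point, the sketch below computes the single apparent efficiency expected when states interconvert much faster than the temporal resolution. It makes a simplifying assumption that is not stated in the text: all states are equally bright, so the observed value is the occupancy-weighted mean of the state efficiencies. The function name and the numbers are hypothetical.

```python
def time_averaged_efficiency(efficiencies, occupancies):
    """Occupancy-weighted mean E for rapidly interconverting states.

    efficiencies: FRET efficiency of each state
    occupancies:  relative time spent in each state (need not sum to 1)
    Assumes equal brightness of all states (a simplification).
    """
    if len(efficiencies) != len(occupancies) or not efficiencies:
        raise ValueError("need one occupancy per state")
    total = sum(occupancies)
    if total <= 0:
        raise ValueError("occupancies must sum to a positive value")
    return sum(e * p for e, p in zip(efficiencies, occupancies)) / total


if __name__ == "__main__":
    # Two states at E = 0.2 and E = 0.8, with the low-FRET state
    # occupied three times as long as the high-FRET state.
    print(time_averaged_efficiency([0.2, 0.8], [3, 1]))
```

A histogram built from such data would show one peak near the weighted mean rather than two peaks at 0.2 and 0.8, which is why an averaged subpopulation should not be read as a distinct conformation without checking for fast dynamics.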
preserved for the long term for independent validations and future reanalysis with new methods.
Depositing the raw data, analysis tools, and pipelines in public repositories such as Dryad,
Dataverse, Zenodo, Figshare, or GitHub (128) will allow different groups to cross-validate
results and accelerate the development of new analysis tools.

So far, we have discussed past and present ap- 
plications of smFRET in biophysics, biochemistry, 
molecular biology, and structural biology. Future 
technological advances striving to overcome the 
current limitations of smFRET measurements 
could further extend the power of smFRET. Sev- 
eral areas of improvement can be envisioned: tem- 
poral resolution, extension to more in vivo and 
in vitro experimental formats, simplification, and 
higher throughput compatible with biopharma- 
ceutical applications. 


What is next for smFRET? 


smFRET has become the accepted method for 
dynamic structural biology but is still almost en- 
tirely used in the context of in vitro experiments. 
In vivo smFRET, which has recently emerged as 
a promising methodology requiring further de- 
velopment (43), may allow explorations of con- 
formational dynamics and heterogeneity in the 
living cell—an approach so far limited to bulk 
“in-cell NMR” (132) and “in-cell FCS” (133). By 
removing the artificial constraints of in vitro 
experiments, in vivo smFRET promises to shed 
light on outstanding questions in biology by mon- 
itoring smFRET as a function of location, diffu- 
sivity, and interactions with other partners, thereby 
illuminating the long-sought link between con- 
formational states and dynamics of biomolecules. 
Some of the challenges of in vivo smFRET measurements
are the generally low signal-to-noise
(S/N) and signal-to-background (S/B) ratios and
the poor photostability of fluorescent proteins.
Organic fluorophores are the probes of choice, but
their use in live cells requires specific delivery 
and tagging protocols, which generally introduce 
larger perturbations and uncertainties than 
conventional molecular biology techniques. Using 
microinjection in cultured cells, Sakon and Weninger 
were the first to track folding of individual SNARE 
proteins (134). Recently, Schuler and co-workers
also used microinjection and smFRET to probe 
the submicrosecond dynamics of individual freely 
diffusing, intrinsically disordered proteins in dif- 
ferent cellular compartments (Fig. 7, lower center) 
(70). The ability to distinguish between subpop- 
ulations while also detecting fast dynamics allows 
identification of different folding behaviors in the 
cytosol and the nucleus. While successfully used 
in these examples, microinjection relies on high- 
precision and low-throughput procedures. Tech- 
niques for internalization of labeled molecules, 
such as electroporation in bacteria and in yeast 
(135) or permeabilization using pore-forming 
reagents in mammalian cells (136), are more fea- 
sible. For instance, several studies have probed 
the conformations and localizations of doubly 
labeled oligonucleotides after microinjection in 
eukaryotic cells (137) or electroporation in bacteria 
(Fig. 7, lower left) (135). Simpler, robust delivery strategies for better labeling yields and high cell viability are needed.

Fig. 7. Emerging applications and future directions of smFRET. Top rows show current detector and excitation formats for smFRET; the bottom row shows emerging developments that go beyond existing capabilities. smFRET measurements have been demonstrated in live bacteria using TIRF with probes internalized via electroporation [left; adapted from (135)] and in eukaryotic cells using confocal excitation and microinjected molecules [center; after (70)]. Multipixel SPADs (right) allow fast detection schemes and will allow retrieval of FRET trajectories of single molecules in vivo (scanning different z-layers via light-sheet microscopy) and in vitro (nonequilibrium kinetics via smFRET using mixers or continuous-flow microfluidic devices) [adapted from (143)].

Commonly used illumination geometries such as TIR or confocal imaging are not always ideal. In TIR excitation mode, only a thin layer (~100 nm) of the cell above the cover glass is illuminated by an evanescent field (Fig. 7, upper left). Confocal excitation (Fig. 7, upper center) allows observation deeper into the cell while maintaining good S/N and S/B, but the diffraction-limited sampling volume requires raster scanning for image formation, which competes with continuous recording at each location. Traditional wide-field epi- or trans-illumination is unsuitable for single-molecule detection because of low S/N and S/B and high levels of photobleaching and phototoxicity. Light-sheet or single-plane illumination microscopy (SPIM) is more complex but enables 3D sectioning with high background rejection and limited phototoxicity and bleaching, and it has been successfully extended to single-molecule imaging (138). The use of SPIM for in vivo smFRET measurements could therefore provide new opportunities, as suggested by recent work using a simplified version (139). Combined with fast detectors [scientific-CMOS (sCMOS) cameras or single-photon avalanche diode (SPAD) arrays], probing fast biological events such as protein binding and conformational dynamics in live cells may become feasible (Fig. 7, right).

smFRET on immobilized molecules allows continuous monitoring of conformations or binding events with fairly good temporal resolution. A potential drawback is the introduction of artificial perturbations due to the surface proximity and immobilization chemistry. One way to bypass this problem is to entrap individual molecules in immobilized lipid vesicles (140, 141). However, this technique limits the ability to modulate the local environment (for instance, by buffer exchange). Another solution, the anti-Brownian electrokinetic (ABEL) trap (42), counteracts the Brownian diffusion of a single molecule in solution by active modulations of an external electric field, but this requires observing one molecule at a time and results in very low throughput. We note that in some cases, proteins and DNA molecules may gain different structures or activities under such conditions. To overcome the need for immobilizing or trapping molecules, we envision confinement within a thin chamber (<100 nm) limiting the diffusion along the z axis. Combined with fast detectors such as sCMOS cameras or SPAD arrays, this would enable tracking of multiple molecules for extended periods of time during their quasi-2D diffusion with reduced surface-interaction artifacts.

A major drawback of diffusion-based smFRET measurements is the long acquisition time (several minutes) needed to accumulate a large number of single-molecule bursts. Acquisition times on
the order of a few seconds would enable an entirely 
new class of applications such as diffusion-based 
smFRET kinetic studies, high-throughput (HT) 
drug screening, and diagnostic assays (Fig. 7, 
lower right). Throughput can be multiplied by 
parallel acquisition from multiple excitation spots 
using SPAD arrays for detection (Fig. 7, right). This 
multispot approach provides a reduction in ac- 
quisition time proportional to the number of spots 
(143), which could potentially reach up to 1000 
pixels for next-generation SPAD arrays suitable for 
single-molecule detection (144). Although the excitation
geometry can be multispot as well, more
scalable excitation schemes include zero-mode 
waveguides or light-sheet illumination. Such schemes 
will eliminate the tedious task of aligning the 
excitation pattern to the detector pixels. Addition- 
ally, as noted above, such 2D illuminations allow 
the use of fast sCMOS cameras (>100 Hz full frame, 
>1 kHz partial frame) (145), which may be sufficient
for some applications such as high-throughput 
screening that currently relies on SPADs or live-cell 
smFRET imaging (Fig. 7, lower right) (135, 139). 
Whereas ensemble kinetics can identify kinetic 
processes that are well separated in time, non- 
equilibrium smFRET kinetic studies can identify 
multiple conformations or binding states and 
their associated transitions. Non-equilibrium 
smFRET kinetic studies rely on rapid exchange
or mixing of reagents to initiate a perturbation 
in the system under study. In experiments on 
immobilized molecules, rapid exchange of con- 
ditions initiates the reaction, which is then recorded 
in as many time trajectories as there are molecules. 
Time-dependent FRET efficiency histograms that 
describe the reaction are computed by aligning 
all the smFRET trajectories (145). Non-equilibrium 
smFRET kinetic studies performed on diffusing 
molecules are more challenging because molecules
randomly enter the observation volume, preventing
continuous time traces from being acquired.
Ingargiola et al. recently measured the kinetics 
of transcription initiation (promoter escape) 
using a multispot setup (143). They directly followed 
the kinetics by monitoring the conformation of 
the transcription bubble at the single-molecule 
level with 30-s temporal resolution, limited only 
by the number of single-molecule bursts detected 
across the eight spots (Fig. 7). A 48-spot system 
currently in development (146) should improve 
the temporal resolution accordingly. However, as 
the temporal resolution of the measurement is 
improved, faster mixing is required. A combination 
of a continuous-flow microfluidic mixer (147) to- 
gether with multispot detection could provide the 
solution for fast non-equilibrium smFRET kinetic 
studies of diffusing molecules (Fig. 7, lower right). 
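
The expected gain from adding spots can be estimated with simple arithmetic. The sketch below assumes, following the statement that acquisition time falls in proportion to the number of spots, that the achievable temporal resolution scales inversely with the spot count at a fixed burst rate per spot. The function name is hypothetical, and the estimate ignores mixing dead time and spot-to-spot variation.

```python
def scaled_temporal_resolution(resolution_s, spots_now, spots_new):
    """Estimate temporal resolution after changing the number of spots.

    Assumes the number of bursts collected per unit time is proportional
    to the number of excitation spots (an idealization).
    """
    if resolution_s <= 0 or spots_now <= 0 or spots_new <= 0:
        raise ValueError("all arguments must be positive")
    return resolution_s * spots_now / spots_new


if __name__ == "__main__":
    # 30-s resolution with 8 spots, extrapolated to a 48-spot system.
    print(scaled_temporal_resolution(30.0, 8, 48), "s")
```

Under this assumption, moving from 8 to 48 spots would take the 30-s resolution to roughly 5 s, which is the regime where the mixing time starts to matter.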

Drug discovery using drug-ligand interaction
measurements relies on high-throughput ensemble
techniques to rapidly screen large libraries of
small molecules for identification of interactions 
and quantification of affinities. Various screen- 
ing methods differ in the range of affinities they 
can measure, their throughput, sample con- 
sumption, accuracy, measurement modality (ki- 
netic, steady-state), possible requirement for 
immobilization, lowest binding stoichiometry, 
etc. Many of the techniques that allow high- 
throughput screening of more than 10* molecules 
per day report either quantitatively on low affinity 
ranges in bulk, or on interactions with immobilized 
small molecules measured by surface plasmon 
resonance (148). Diffusion-based smFRET assays 
could enable probing such interactions with min- 
imal sample consumption. However, until recently, 
such measurements required long acquisition 
times. Additionally, such screening requires an 
automated system that can rapidly exchange con- 
ditions. Kim et al. used a microfluidic mixing de- 
vice to automate titration from many input 
channels and perform serial smFRET measure- 
ments at different conditions (149). Multispot 
and multicolor smFRET in combination with an 
automated mixing device would allow highly mul- 
tiplexed smFRET measurements, suitable for high- 
throughput screening (Fig. 7). As an example, a 
1024-spot system would allow measurements 
lasting ~250 ms, translating to ~350,000 assays 
per day (assuming that the microfluidic chip en- 
ables dispensing of as many samples at this fre- 
quency). The same approach could be used to 
titrate binding components to produce affinity 
curves for each ligand down to picomolar con- 
centrations and in varying conditions. Alternatively, 
a multispot setup could be used in conjunction with
a titer plate and scanning stage, where many different
conditions in each well can be tested at a
much higher rate than with a single-spot excita- 
tion. Lastly, using multispot smFRET acquisition 
in a stopped-flow format would allow measuring 
association and dissociation rate constants and 
extraction of molecular affinities (Fig. 7, right). 
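
The throughput figure quoted for a 1024-spot system follows from dividing the seconds in a day by the duration of one measurement. The sketch below reproduces that arithmetic; the `dead_time_s` parameter is an added assumption for sample exchange overhead and defaults to zero, which matches the text's premise that the microfluidic chip can keep up with the measurement rate.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def assays_per_day(measurement_s, dead_time_s=0.0):
    """Number of back-to-back assays that fit into 24 hours.

    measurement_s: acquisition time per assay
    dead_time_s:   hypothetical overhead per assay (sample exchange, settling)
    """
    cycle_s = measurement_s + dead_time_s
    if cycle_s <= 0:
        raise ValueError("cycle time must be positive")
    return int(SECONDS_PER_DAY // cycle_s)


if __name__ == "__main__":
    print(assays_per_day(0.25))        # 250-ms measurements, no overhead
    print(assays_per_day(0.25, 0.25))  # same, with 250 ms of overhead each
```

At 250 ms per measurement this gives 345,600 assays, consistent with the quoted ~350,000 per day; any per-sample overhead reduces the figure proportionally.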

Likewise, molecular diagnostics could benefit 
from high-throughput smFRET capabilities. Such 
applications require highly specific and sensitive 
molecular recognition of low-abundance molec- 
ular markers (proteins, self-antibodies, microRNAs, 
freely circulating DNA, etc.) in a small volume of 
bodily fluids, ideally without any amplification 
(107, 150). Here again, fast acquisition coupled 
with automation of liquid handling is required. 
A combination of multispot smFRET with multi- 
color ALEX capabilities and a microfluidic chip 
could provide a powerful molecular diagnostics 
platform. 


Conclusion 


Two decades after its introduction, the promise 
of smFRET has largely materialized; several var- 
iants have now reached maturity to form a robust 
and mainstream toolkit available to biochemists, 
molecular biologists, and biophysicists. Commer- 
cial systems implementing smFRET have been 
introduced in recent years. We anticipate further 
development of such systems into turnkey and, 
eventually, fully automated devices based on open- 
source and validated data analysis algorithms, 
which will lower the barrier of entry to this pow- 
erful technology and further help to disseminate 
the method. 
INTRODUCTION: The outer membranes of Gram-negative bacteria, mitochondria, and chloroplasts characteristically contain β-barrel membrane proteins. These proteins contain multiple amphipathic β strands that form a closed barrel. This arrangement exposes hydrophobic amino acid residues to the lipid phase of the membrane, with polar residues facing the lumen of the barrel. β-Barrel proteins form outer membrane channels for protein import and export, and for metabolite and nutrient exchange.

An essential step in the biogenesis of β-barrel proteins is their insertion into the outer membrane. The β-barrel assembly machinery (BAM) of bacteria and the sorting and assembly machinery (SAM) of mitochondria are crucial for the membrane insertion of β-barrel precursors. The core subunits of these machineries, BamA and Sam50, are homologous 16-stranded β-barrel proteins that belong to the outer membrane protein family 85 (Omp85). The β signal located in the last β strand of the precursor initiates protein insertion into the outer membrane; however, the molecular mechanism of β-barrel insertion has not been understood. Controversial models about the role of BAM and SAM have been discussed. These models either favor precursor translocation into the BamA or Sam50 barrel followed by lateral release through an opened β-barrel gate or suggest membrane thinning and precursor insertion at the BamA or Sam50 protein-lipid interface.


RATIONALE: Structural studies have suggested that BamA and Sam50 harbor a dynamic lateral gate formed between β strands 1 and 16. In addition, BamA and Sam50 have been proposed to induce a thinning of the lipid bilayer near the lateral gate. To determine the translocation pathway during β-barrel membrane insertion, we probed the proximity of β-barrel precursors (Tom40, Por1, VDAC1) to Sam50 in intact mitochondria of the model organism baker’s yeast, Saccharomyces cerevisiae. We engineered precursors and Sam50 variants with cysteine residues at defined positions and mapped the environment of precursors in transit by disulfide-bond scanning and cysteine-specific cross-linking.

β-Barrel protein insertion via the lateral gate of Sam50. β-Barrel precursors are transferred through the Sam50 interior to the lateral gate, which is formed by β strands 1 and 16. Upon gate opening, the β signal of the precursor substitutes for the endogenous Sam50 β signal. A conserved loop of Sam50 promotes β-signal binding to the gate and insertion of subsequent β hairpins. The folded β-barrel protein is released into the outer membrane. Po, polar amino acid residue; G, glycine; Hy, hydrophobic amino acid residue; C, C terminus; IRGF, binding motif.


RESULTS: Our findings indicated that during transport of β-barrel precursors by the SAM complex, the lateral gate of Sam50 between β strands 1 and 16 was open and contained accumulated precursor. The β signal of the precursor specifically interacted with β strand 1 of Sam50 and thus replaced the endogenous β signal (β strand 16) of Sam50. Precursor transfer to the lateral gate occurred via the channel lumen of Sam50 and required the conserved loop 6 located in the channel. β Hairpin-like elements consisting of two antiparallel β strands of the precursor were translocated and inserted into the lateral gate. The precursor remained associated with the Sam50 gate until the folded full-length β-barrel protein was released into the outer membrane.


Read the full article at http://dx.doi.org/10.1126/science.aah6834


CONCLUSION: Our findings indicate that β-barrel precursors are inserted into the lumen of the Sam50 channel and are released into the mitochondrial outer membrane via the opened lateral gate of Sam50. The carboxy-terminal β signal of the precursor initiates opening of the gate by exchange with the endogenous Sam50 β signal. An increasing number of β hairpin-like loops of the precursor accumulate at the lateral gate. Upon folding at Sam50, the full-length β-barrel protein is laterally released into the outer membrane. Membrane thinning in the vicinity of the lateral gate likely facilitates insertion of the protein into the lipid bilayer. Thus, the membrane-insertion pathway of β-barrel proteins combines elements of both controversially discussed models: transport through the lumen of Sam50 and the lateral gate and subsequent insertion into the thinned membrane next to the gate. Owing to the conservation of both the β signal and Omp85 core machinery, we speculate that β-signal exchange, folding at the gate, and lateral release into the membrane represent a general mechanism for β-barrel protein biogenesis in mitochondria, chloroplasts, and Gram-negative bacteria.
The biogenesis of mitochondria, chloroplasts, and Gram-negative bacteria requires the
insertion of β-barrel proteins into the outer membranes. Homologous Omp85 proteins are
essential for membrane insertion of β-barrel precursors. It is unknown if precursors are
threaded through the Omp85-channel interior and exit laterally or if they are translocated into
the membrane at the Omp85-lipid interface. We have mapped the interaction of a precursor in
transit with the mitochondrial Omp85-channel Sam50 in the native membrane environment.
The precursor is translocated into the channel interior, interacts with an internal loop, and
inserts into the lateral gate by β-signal exchange. Transport through the Omp85-channel
interior followed by release through the lateral gate into the lipid phase may represent a basic
mechanism for membrane insertion of β-barrel proteins.


β-Barrel proteins are of central importance in the outer membranes of mitochondria, chloroplasts, and Gram-negative bacteria. In eukaryotic cells, β-barrel proteins are essential for the communication between the double membrane-bound organelles and the rest of the cell. β-Barrel channels mediate the translocation of a large number of metabolites and the import of organellar precursor proteins that are synthesized in the cytosol. The machineries for the biogenesis of β-barrel proteins have been identified in mitochondria and bacteria, termed sorting and assembly machinery (SAM) and β-barrel assembly machinery (BAM), respectively (1-6). The core component of the β-barrel insertion machinery is a member of the Omp85 superfamily, conserved from bacteria (BamA) to humans (Sam50, also called Tob55), whereas accessory BAM and SAM subunits are not conserved (1, 2, 4, 5, 7-11). The most C-terminal β strand of each precursor serves as a signal recognized by the Omp85 machinery (12, 13), and the assembly of a β-barrel protein was shown to occur from the C terminus (14). Upon closure of the barrel, the protein is released from the assembly machinery (15).
Members of the Omp85 superfamily form 16-stranded β barrels, including BamA and Sam50, the filamentous haemagglutinin secretion protein FhaC, and the translocation and assembly module TamA (14, 16-19). In the case of FhaC, a substrate protein was shown to be translocated across the bacterial outer membrane through the interior of the β-barrel channel (20). The substrates of BamA, Sam50, and TamA, however, have to be inserted into the lipid phase to become integral outer membrane proteins. High-resolution structures of BamA and TamA and disulfide scanning revealed a flexible interaction of the first and last β strand, suggesting a lateral opening of a β-barrel gate toward the membrane and a distortion of the adjacent membrane lipids (16, 18, 21-27). Different models have been discussed for the BamA-, Sam50-, and TamA-mediated insertion of β-barrel precursors into the outer membrane (5, 15, 16, 18, 21-38). In the BamA- and Sam50-assisted models, the precursor is inserted at the protein-lipid interface; BamA or Sam50 creates a distortion and thinning of the membrane that favors spontaneous insertion of the precursor into the membrane. In the BamA- and Sam50-budding model, the precursor is threaded through the β-barrel interior of BamA or Sam50 and is laterally released through an opened lateral gate. The BamA structures, which were obtained in non-native environments and in the absence of precursor proteins (35), supported arguments for both models (16, 21-26), and thus the mechanism of β-barrel translocation by means of BAM or SAM is unknown.


Lateral gate of the Sam50 β barrel in
the mitochondrial outer membrane


We developed a system to map the interaction of Sam50 with β-barrel precursors in transit in the native mitochondrial membrane environment. The β-barrel channel of Sam50 was modeled based on the BamA structures and cysteine and disulfide scanning of β strands 1 and 16 (Fig. 1, A and B, and fig. S1, A to C) (39, 40). In the absence of precursor proteins, β strands 1 and 16 interacted, i.e., the putative lateral gate was closed (Fig. 1B and fig. S1C) (37). However, oxidation-induced disulfide formation between distinct cysteines also revealed a sliding of β strands 1 and 16, i.e., a dynamic behavior of the gate (27). To probe for the possible opening of the gate in the presence of substrate, we tested β-barrel precursors that contained the β-hairpin mitochondrial targeting signal (6) and imported them into isolated intact mitochondria, followed by position-specific SH cross-linking of β strands 1 and 16. The cross-linking reagent bismaleimidohexane (BMH) showed a high efficiency for stably linking strands 1 and 16 in the absence of substrate (Fig. 1C, lane 2, and fig. S1C). A C-terminal fragment of the major mitochondrial β-barrel protein porin (Por1), also called VDAC, including the Por1 β signal, considerably disturbed the interaction of Sam50 β strands 1 and 16 (Fig. 1C, lane 4), indicating that the Por1 substrate interfered with gate closing.


β-Signal exchange in the lateral gate and
release of the full-length β-barrel precursor


It has been speculated that the β signal may be specifically recognized by BamA and Sam50 by means of exchange of the endogenous BamA and Sam50 β signal (31, 33), yet experimental demonstration has been lacking (35). β Strand 16 of BamA and Sam50 functions as a β signal, and thus, in the exchange model, the β signal of the precursor, corresponding to the C-terminal β strand 19 of Por1, should interact with Sam50-β1. To test this hypothesis, we synthesized a radiolabeled [35S]Por1 substrate carrying a single cysteine residue at distinct positions of the β signal. After import into mitochondria containing Sam50 with a single cysteine residue at different positions in β strands 1 or 16, we probed the proximity of the β strands by disulfide formation. The Por1 β signal indeed specifically aligned with Sam50-β1 such that residues predicted to point toward either the channel interior (black) or the lipid phase (gray) selectively interacted (Fig. 2A and fig. S2A).

We performed several control experiments and obtained the following results: (i) The Por1 β signal selectively interacted with Sam50-β1, but not with Sam50-β16 (Fig. 2A and fig. S2A). (ii) To test a different β signal, we imported a 35S-labeled C-terminal precursor of the mitochondrial import channel Tom40 and observed a comparable pairing with Sam50-β1 (fig. S2B). (iii) A precursor containing a mutant form of the Por1 β signal [replacement of a conserved hydrophobic residue (13, 41)] was strongly impaired in the interaction with Sam50-β1 (Fig. 2B). These results show that the β signal of precursors specifically interacts with Sam50-β1 (Fig. 2C). (iv) We analyzed substrates of different size, covering the range from 5 to 18 β strands, and observed disulfide formation between the Por1 β signal and Sam50-β1 in each case (Figs. 2A and 3A and fig. S2A). (v) Comigration of the differently sized Por1 β-barrel precursors with the SAM complex, as observed by blue native gel analysis (1, 3, 8, 9, 13), showed that each substrate accumulated at the SAM complex (Fig. 3, B and C). (vi) Only the full-length Por1 precursor, corresponding to 19 β strands, was released from the SAM complex and assembled into the mature porin complex (Fig. 3, B and C) (42-45).

Fig. 1. Intramolecular interaction between the first and last β strands of Sam50. (A) Model of the Sam50 β barrel. Engineered disulfide bonds between the first and last β strands of Sam50 are indicated in red, where numbers in black indicate positions of cysteine residues. Additional disulfide bonds are possible because of the dynamic interaction of β strands 1 and 16 (27). IMS, intermembrane space; OM, outer membrane. (B) Yeast strains expressing cysteine-free Sam50 (Cfree) and Sam50 cysteine variants (containing exactly two cysteine residues, as indicated in green and blue)

Taken together, we conclude that the β signal of the precursor is bound by Sam50-β1 through exchange with the endogenous Sam50 β signal (β16) (Fig. 2C). Porin precursors of up to 18 β strands accumulate at the SAM complex, and only the full-size precursor is released into the lipid phase of the outer membrane.


β-Barrel precursors interact with both
sides of the Sam50 gate


We asked if the substrate also interacted with β strand 16 of Sam50 and performed disulfide scanning between this β strand and the N-terminal region of the precursor, corresponding to β strand 14 of mature Por1. We tested five distinct amino acid positions corresponding to Por1-β14 and observed disulfide formation with Sam50-β16 in each case (Fig. 4, A and B). However, the interaction showed a considerably higher flexibility than that of the β signal of the precursor with Sam50-β1 (Fig. 2 and fig. S2). A Por1 precursor with a mutant β signal strongly inhibited the interaction of the N-terminal precursor region with Sam50-β16 (fig. S3). Because the β signal itself did not interact with Sam50-β16, this finding indicates that the specific binding of the β signal to Sam50-β1 is a prerequisite for the accumulation of the N-terminal precursor region at Sam50-β16. To provide further evidence that the precursor was intercalated between β strands 1 and 16 of Sam50, we studied if it interacted with both strands simultaneously. Por1 precursors containing two cysteine residues, one in the C-terminal β signal and one in the N-terminal region, were accumulated at Sam50, carrying a cysteine residue in β1 as well as in β16, and subjected to oxidation. In addition to the single disulfides formed (like in Figs. 2, A and B, and 4, A and B), we observed the formation of two disulfides simultaneously (Fig. 4C, lanes 3 and 7).

Our results indicate that β-barrel precursors are inserted into a Sam50 gate formed between β strands 1 and 16. The C-terminal β signal specifically exchanges with Sam50-β1, whereas the N-terminal region of the precursor undergoes a flexible interaction with Sam50-β16.


Translocation of β-barrel precursors into
the Sam50 channel


The N-terminal region of the precursor (residues
204 to 207) was also found in close proximity to
the first residue (residue 126) of Sam50-β1 (Fig. 4,
the oxidant 4,4’-dipyridyl disulfide (4-DPS), followed by nonreducing SDS-polyacrylamide gel electrophoresis (PAGE), Western blotting, and immunodecoration. Oxid., oxidized; red., reduced. (C) Isolated Sam50C128/C480 mitochondria were incubated with a Por1-precursor construct (β strands 15 to 19 corresponding to amino acid residues 210 to 283) and controls, as indicated. Samples were treated with the cross-linker BMH and analyzed as in (B). Quantification of cross-linking efficiency (mean ± SEM, N = 3 for samples 1, 2, and 4; mean with range, N = 2 for sample 3). Sam50x, cross-linked Sam50.


A and B). Sam50Cys126 is positioned at the intermembrane space opening of the Sam50 channel and predicted to point toward the channel interior (Fig. 1A). Por1Cys207, which is located toward the cytosolic side of mature Por1 (42-44), was not only found in proximity to Sam50Cys126 but also to further residues of Sam50-β1 that are predicted to face the channel interior (residues 128 and 130) (Fig. 4A and fig. S3). Disulfide formation between the N-terminal region of Por1 and Sam50-β1 was impaired when the Por1 β signal was mutated (fig. S3). Thus, a functional C-terminal β signal is a prerequisite for the observed proximity of the N-terminal precursor region to Sam50-β1 (pairing between Sam50-β1 and the β signal involves hydrogen bonds of the polypeptide backbone, and thus, cysteine side chains are available for disulfide formation). These findings are compatible with a model in which, upon binding of the β signal to Sam50-β1, the N-terminal region of the precursor passes along the interior of Sam50-β1.

To obtain independent evidence that β-barrel
precursors are using the interior of the Sam50
channel, we analyzed Sam50 β strand 15 and
compared residues predicted to face either the
channel interior (black) or the lipid phase (gray)
(Fig. 5A). A [35S]Por1 precursor with a single cysteine
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Fig. 2. Interaction of Sam50 £ strand 1 with the C-terminal B signal 
of precursor proteins. (A) Radiolabeled Porl(B15 to 19) precursors 
containing one cysteine at the positions indicated were imported for 5 min 
into mitochondria isolated from yeast strains expressing Sam50 with 

the indicated cysteine residues, followed by oxidation with 4-DPS 

(lanes 1 to 12 and 19 to 36) or CuSO, (lanes 13 to 18). Samples were 
analyzed by nonreducing SDS-PAGE and autoradiography. Arrowheads, 
disulfide-bonded Sam50-Por1(B15 to 19) adducts. Schematic model, 
disulfide-bond formation of Sam50 8B strand 1 with the B signal (B19) of the 
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porin precursor B15 to 19; thick and thin lines indicate strong and weak 
formation of Sam50-Porl adducts, respectively. (B) [°°S]Porl(B14 to 19)c276, 
[°5S]Porl(B14 to 19)cogo, and the corresponding 8-signal mutants 
(L279A, leucine replaced with alanine) were incubated for 5 min with 
isolated mitochondria of Sam50 cysteine variants followed by 

oxidation with 4-DPS, nonreducing SDS-PAGE, and autoradiography. 
Arrowheads, cysteine-specific Sam5O-precursor adducts. 

(C) Schematic model illustrating the B-signal exchange observed in 

Fig. 2 and fig. S2. 
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complex. (A) [°°S]Porlc276 constructs of different lengths were incubated 
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followed by oxidation with 4-DPS, analysis by nonreducing SDS-PAGE, and 
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compensate for the strong intensity of full-length Porl. Arrowheads, 
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disulfide-bonded Sam50-precursor adducts; circles, Porle276 precursor. 
(B and C) Samples as described for (A) were analyzed by blue native 
electrophoresis and autoradiography. Lanes 1 to 4 versus lanes 5 to 12 of 
(B) have a linear adjustment of brightness and contrast to compensate 
for the strong intensity of Porl(B15 to 19)c276. POR, assembled porin 
complex; SAM-Porl, Porl precursor at SAM. 


residue in the N-terminal region (residue 205) was imported into Sam50
containing a single cysteine at different positions of either β strand
15 or 16. In contrast to Sam50-β16, we did not observe disulfide formation
between the precursor and Sam50-β15 upon oxidation (fig. S4), indicating
that Por1C205 was not close enough to Sam50-β15 to promote disulfide
formation. By using SH-specific BMH, the precursor was cross-linked to
Sam50-β15 and β16. Whereas the cross-linking occurred to various residues
of Sam50-β16 (comparable to the oxidation assay), only residues of
Sam50-β15 predicted to face the channel interior were cross-linked to the
precursor (Fig. 5B). To probe further regions of the precursor, we used the
short amine-to-sulfhydryl cross-linking reagents
N-α-maleimidoacet-oxysuccinimide ester (AMAS) and succinimidyl iodoacetate
(SIA) together with a cysteine-free Por1 precursor and Sam50 containing a
single cysteine residue in β15. Cysteine-specific cross-linking occurred
only to Sam50-β15 residues predicted to face the channel interior (Fig. 5C,
arrowheads) (a larger nonspecific band at 60 kDa was formed when no
SH-group was available, i.e., also with cysteine-free Sam50). These results
are fully compatible with the model that transfer of the Por1 precursor
involves the interior of the Sam50 channel, but do not fit a model in which
the Por1 precursor is inserted at the protein-lipid interface without
getting access to the channel.


Sam50-loop 6 is required for β-signal binding

In addition to the β-barrel channel, Sam50 possesses two major
characteristic elements, an N-terminal polypeptide transport-associated
(POTRA) domain exposed to the intermembrane space and a highly conserved
loop 6 that extends from the cytosolic side of the β barrel. Whereas
bacterial BamA proteins contain several POTRA domains that interact with
β-barrel precursors and are crucial for precursor transfer from the
periplasm into the outer membrane (17, 46-49), Sam50 contains a single
POTRA domain that is not essential for cell viability (13, 50, 57).
Disulfide formation between the Por1 precursor and Sam50 β strands 1 and 16
was not blocked in mitochondria lacking the entire POTRA domain (fig. S5).
Together with blue native gel analysis (13, 45), this result indicates that
the single POTRA domain is not crucial for precursor transfer to Sam50.
The second element, loop 6, extends from the outside, or cytosolic side,
into the channel interior in all Omp85 high-resolution structures analyzed
(Fig. 6A) (16, 18, 21-25, 52). Deletion of Sam50-loop 6 was lethal to yeast
cells. When wild-type Sam50 was depleted, expression of a Sam50 mutant form
lacking the conserved segment of loop 6 did not rescue growth and led to
strong defects in the import of 35S-labeled β-barrel precursors such as
Por1 and Tom40 into mitochondria (fig. S6, A and B). The steady-state
amounts of β-barrel proteins and various Tom proteins were decreased
(fig. S6C). As the translocase of the outer mitochondrial membrane (TOM
complex) imports a large number of precursor proteins, this mutant did not
permit a selective analysis of the function of loop 6. We thus generated
point mutants of the conserved IRGF (Ile-Arg-Gly-Phe) motif of loop 6
(53, 54). Sam50R366A yeast, in which the Sam50 arginine at position 366 is
replaced with an alanine, exhibited a temperature-sensitive growth
phenotype on nonfermentable medium (fig. S7A). Mitochondria isolated upon
growth of the mutant cells at permissive temperature showed normal
steady-state amounts of SAM, TOM, and further control proteins (fig. S7, B
and C). The import of 35S-labeled β-barrel precursors such as Por1, Mdm10,
and Tom40 was strongly inhibited (Fig. 6B), whereas the import of
matrix-targeted and intermembrane space-targeted precursors, which depend
on the TOM complex but not on SAM, was not or only mildly affected
(fig. S7D). The import of [35S]Tom40 can be dissected into distinct stages
by blue native gel analysis (1, 3, 8, 9). Sam50R366A mitochondria were
impaired in the formation of SAM-bound intermediates (Fig. 6B). We conclude
that loop 6 of Sam50 is required for a stable interaction of the precursor
with SAM. It has been reported that both Sam50 and Sam35 are needed for
binding of a β-barrel precursor to the SAM complex (13). To directly test
the contribution of loop 6, we performed affinity purification from lysed
mitochondria using a purified β signal-fusion protein, leading to the
copurification of Sam50 and Sam35 from wild-type mitochondria; a mutant
β signal did not pull down Sam50-Sam35 (Fig. 6C) (13). The interaction of
Sam50-Sam35 with the β signal was strongly disturbed in Sam50R366A
mitochondria (Fig. 6C), demonstrating that loop 6 is required for stable
precursor binding to Sam50-Sam35.

Fig. 4. Interaction of Sam50 with the N-terminal β strand of precursor
proteins. (A) [35S]Por1(β14 to 19) precursors containing a single cysteine
residue, as indicated, were imported into mitochondria isolated from yeast
strains expressing the specified Sam50 cysteine variants, followed by
oxidation with 4-DPS, nonreducing SDS-PAGE, and autoradiography. Black and
white arrowheads show cysteine-specific disulfide-bonded Por1(β14 to 19)
adducts to the C- and N-terminal β strand of Sam50, respectively. Right,
schematic models. (B) [35S]Por1(β14 to 19)C206 and [35S]Por1(β14 to 19)C204
were treated as described in (A). (C) [35S]Por1(β14 to 19) single- and
double-cysteine variants were incubated with isolated mitochondria from
yeast strains expressing Sam50Cfree or the double-cysteine variant
Sam50C126/C480, followed by oxidation with 4-DPS. Samples were analyzed
as described in (A).


β Hairpin-like transport of precursor proteins by Sam50


To determine if a precursor in transit was in proximity to loop 6,
[35S]Por1 precursors with a single cysteine residue in the N-terminal
region were imported into mitochondria containing Sam50 with a single
cysteine residue in loop 6. By SH-specific cross-linking, the precursors
were linked to residue 371 of loop 6 (Fig. 7A). A mutant β signal prevented
cross-linking of the N-terminal precursor region to loop 6 (fig. S8A),
whereas the β signal itself was not found in proximity to loop 6 (fig. S8B,
lanes 1 to 6), supporting our conclusion that a functional β signal is a
prerequisite for further translocation steps of the precursor. It has been
suggested that β-barrel precursors transported by SAM and BAM may be
partially folded, such that β hairpins consisting of two adjacent β strands
are formed (35, 55). We used distinct approaches to assess this view:
(i) By using precursors of different lengths, covering 5, 6, 7, or 8
β strands of mature


Hohr et al., Science 359, eaah6834 (2018) 19 January 2018 




RESEARCH | RESEARCH ARTICLE 


Fig. 5. Interaction of β-barrel precursor with Sam50 residues facing the
channel interior. (A) Model of the Sam50 β barrel. β Strand 15, purple;
open and filled circles, residues facing the interior of the barrel (black)
or the lipid phase (gray), respectively. (B and C) Radiolabeled Por1
precursor variants were imported into mitochondria of yeast strains
expressing the indicated Sam50 variants.
Samples were cross-

Por1, we found that only precursors corresponding to an even number of
β strands were cross-linked to loop 6 (Fig. 7A and fig. S8B, lanes 7 to
30). (ii) We analyzed an internal precursor region that corresponds to a
β hairpin in mature Por1 by inserting a pair of cysteine residues at the
putative adjacent β strands and a tobacco etch virus (TEV) protease
cleavage site at the predicted loop between the β strands. Upon import of
the [35S]precursor into mitochondria and lysis, TEV protease cleaved the
precursor into two fragments (fig. S9A). When SH-specific cross-linking was
performed before lysis, the fragments were not separated, demonstrating
that the corresponding cysteines of the predicted adjacent β strands were
indeed in close, hairpin-like proximity. (iii) We inserted single cysteine
residues into precursor regions that correspond to cytosolic loops or
intermembrane space-exposed turns of mature Por1 and imported them into
mitochondria containing a single cysteine in Sam50-loop 6 (summarized in
Fig. 7B). The predicted most-C-terminal precursor loop was cross-linked to
residue 369 of Sam50-loop 6, whereas the predicted most-N-terminal
precursor loop was preferentially cross-linked to residue 371 (Fig. 7C and
fig. S9B; precursors of different lengths and SH-specific cross-linkers
with different spacer lengths yielded a comparable pattern). Cysteines
inserted into the predicted precursor turns were not cross-linked to
Sam50-loop 6 (Fig. 7B and fig. S9C). (iv) The specific pairing of the
C-terminal β signal of the precursor with Sam50-β1 (Fig. 2 and fig. S2)
indicates that the β signal is likely in a β-strand conformation. These
results suggest that β-barrel precursors interacting with Sam50 are not in
a random conformation, but are partially folded and contain β hairpin-like
elements.

Taken together, loop 6 of Sam50 is in proximity to the precursor in transit
and plays a crucial role in β-barrel biogenesis. Thus, in contrast to the
POTRA domain, the functional importance of loop 6 in precursor transfer has
been conserved from the bacterial Omp85 proteins FhaC and BamA (53, 54, 56)
to Sam50. The analysis of precursor interaction with Sam50 supports the
view that precursor insertion involves β hairpin-like conformations.


Discussion 


We conclude that the biogenesis of mitochondrial β-barrel precursors
involves the gate formed by the first and last β strands of Sam50. The
analysis in the native mitochondrial system provides strong evidence for
both the exchange model of β-signal recognition and the lateral release
model of precursor exit through the Sam50 β-barrel gate (31, 33, 35, 36).
Our findings suggest the following translocation path of a mitochondrial
β-barrel precursor through SAM (Fig. 8). The precursor enters the interior
of the Sam50 channel from the intermembrane space side in close proximity
to Sam50-β1. The C-terminal β signal of the precursor is specifically bound
to Sam50-β1 by exchange with the endogenous Sam50 β signal (Sam50-β16),
leading to an opening of the lateral gate. The conserved loop 6 of Sam50 is
involved in precursor transfer to the lateral gate. More and more
N-terminal portions of the precursor are threaded through the gate in close
proximity to Sam50-β16. Upon translocation of the entire precursor
polypeptide chain by Sam50, the full-length β barrel can be formed and
released from the SAM complex (13).

When comparing mitochondrial and bacterial β-barrel biogenesis, the
pathways start in different locations (eukaryotic versus bacterial cytosol)
and converge at the central Sam50 or BamA β barrels. Three main stages can
be distinguished: (i) Initial translocation into the intermembrane space
and periplasm is mediated by nonrelated translocases: the TOM complex of
the mitochondrial outer membrane and the Sec complex of the bacterial
plasma membrane (5, 6). (ii) Subsequent precursor transfer to the outer
membrane is performed in part by related machineries, including
intermembrane space and periplasmic chaperones and POTRA domains (46-49,
57-59). The bacterial transfer machinery is considerably more complex than
that of mitochondria, likely reflecting the large number of bacterial
β-barrel substrates (60). Bacteria use multiple POTRA domains and several
periplasm-exposed Bam proteins (5, 15), whereas mitochondria contain a
single nonessential POTRA domain and no accessory intermembrane
space-exposed proteins (13, 50). The two cytosol-exposed peripheral Sam
proteins are involved in formation of a TOM-SAM supercomplex (Sam37) and
stabilization of the SAM-bound form of the precursor (Sam35) (9-11, 13, 39,
41). (iii) Finally, the membrane insertion process occurs by means of the
highly conserved membrane-integral part of Sam50 and BamA. The β signal has
been well conserved, and several examples were reported showing that the
β signal is exchangeable between bacteria, mitochondria, and chloroplasts
(12, 13, 61), underscoring the conservation of basic mechanisms of β-barrel
biogenesis. β-Barrel proteins are anchored in the lipid phase by a
hydrophobic belt; the diminished hydrophobic area near the Sam50 and BamA
lateral gates is thought to cause a membrane thinning (16, 27). In vitro
studies on β-barrel membrane protein insertion demonstrate that membrane
defects and BamA-mediated membrane distortion support membrane insertion
(62-64). Sam50- and BamA-induced membrane thinning may contribute to
β-barrel membrane protein biogenesis in vivo by facilitating protein
membrane insertion upon release from the SAM or BAM lateral gate. We
propose that elements of both controversially discussed mechanisms, the
budding model and assisted model, will be used in the lateral gate-sorting
mechanism shown here.

Fig. 6. Loop 6 of Sam50 is essential for β-barrel biogenesis. (A) Model
of the Sam50 β barrel indicating loop 6 in peach and the conserved IRGF
motif at the tip of loop 6 in red. The positions of residues 369 and 371
used for cross-linking in Fig. 7 are indicated. (B) Assembly of full-length
β-barrel precursor proteins [35S]Por1, [35S]Mdm10, and [35S]Tom40 in
wild-type (WT) and Sam50R366A mitochondria was analyzed by blue native
electrophoresis and autoradiography. SAM-Mdm10 indicates the SAM-Mdm10
complex; SAM-Ia, SAM-Ib, and Int-II are assembly intermediates of Tom40.
(C) Immobilized glutathione-S-transferase (GST)-fusion proteins carrying
the Por1 β signal were incubated with digitonin-lysed WT and Sam50R366A
mitochondria. The β signal was released by thrombin protease cleavage, and
eluates were

analyzed by SDS-PAGE and immunodecoration. Load 12.5%; elution 100%.

The large diversity of bacterial β-barrel proteins and the involvement of
multiple POTRA domains and accessory Bam proteins (5, 15, 51, 60) raise the
possibility that additional precursor-specific folding pathways may
complement the central mechanism of β-signal exchange and sorting by means
of the lateral gate elucidated here. For example, assembly of oligomeric
β barrels in bacteria might be stalled at the BAM complex until all
subunits are assembled (65), similar to the arrest of shortened precursor
constructs of monomeric β barrels (Fig. 3). We envision that precursor
insertion through the β-barrel channel and lateral gate demonstrated with
mitochondrial Sam50 represents a basic mechanism that can also be employed
by β-barrel assembly machineries of bacteria and chloroplasts.


Materials and methods
Site-directed mutagenesis


Mutagenesis was performed using the centromeric plasmid pFL39 (66)
containing the wild-type open reading frame of Saccharomyces cerevisiae
SAM50, TOM40, or POR1 and their corresponding native promoter and
terminator sequences (table S1). Primers listed in table S2, containing the
specific mutational changes, were used for PCR with the high-fidelity
polymerases KOD (Sigma-Aldrich) or Q5 (NEB). After DpnI (NEB) template
digestion (3 hours at 37°C), PCR products were transformed into competent
XL-1 Blue Escherichia coli cells (Stratagene). Plasmids were isolated by
using the QIAprep Spin Miniprep Kit (Qiagen). Successful mutagenesis was
confirmed by sequencing.


Yeast strains and growth conditions 


Because SAM50 is an essential gene, the plasmid shuffling method was used
to exchange SAM50 wild-type with mutated versions of sam50 in a YPH499
background (67). The shuffling strain sam50Δ contains a chromosomal
deletion of SAM50 and expresses a wild-type copy of SAM50 on a YEp352
plasmid with a URA3 marker (7). After transformation of the centromeric
TRP1 plasmid pFL39 containing a mutated sam50 allele, positive clones were
selected on medium lacking tryptophan. By growth on plates containing
5-fluoroorotic acid (5-FOA) (Melford), cells that lost the URA3 plasmid
expressing wild-type SAM50 were selected. Subsequently, yeast cells were
grown on nonfermentable medium containing glycerol to rule out the loss of
mitochondrial DNA. At each step, plates were incubated at 23°C to minimize
possible temperature-sensitive growth defects. Yeast cells were cultured in
liquid YPG medium (1% [w/v] yeast extract (Becton Dickinson), 2% [w/v]
bacto peptone (Becton Dickinson), 3% [w/v] glycerol (Sigma), pH 5 with HCl
(Roth)) at 23°C with shaking at 130 rpm. For growth tests, single yeast
cells were picked and incubated overnight in 5 ml YPG. Cells corresponding
to an OD600 of 1 were taken from yeast strains indicated and resuspended in
1 ml autoclaved and distilled H2O. The suspension was further diluted by
factors of 1:10, 1:100, 1:1000 and 1:10,000. Aliquots of 3 or 5 µl were
dropped on solid YPG (1% [w/v] yeast extract, 2% [w/v] bacto peptone, 3%
[w/v] glycerol, 2.5% [w/v] agar (Becton Dickinson)) and YPD (1% [w/v] yeast
extract, 2% [w/v] bacto peptone, 2% [w/v] glucose (Roth), 2.5% [w/v] agar).
Plates were incubated at indicated temperatures. Yeast cells expressing
Sam50 lacking loop 6 (sam50Δloop6) did not yield colonies after plasmid
shuffling. Therefore, the plasmid encoding Sam50Δloop6 was transformed into
a YPH499 strain expressing SAM50 under the control of a galactose promoter.
After selection on galactose (Sigma-Aldrich) containing medium lacking
tryptophan, the shutdown of SAM50 wild-type was performed by growth in
liquid SL-medium (0.3% [w/v] yeast nitrogen base w/o amino acids (Becton
Dickinson), 0.077% [w/v] complete supplement mix (-TRP) (MP Biomedicals),
0.05% [w/v] NaCl (Roth), 0.05% [w/v] CaCl2 (Roth), 0.06% [w/v] MgCl2
(Roth), 0.1% [w/v] NH4Cl (Roth), 0.1% [w/v] KH2PO4 (Roth), 0.6% [w/v] NaOH
(Roth), 2.2% [v/v] lactic acid (Roth), 0.05% [w/v] glucose) (11, 13, 68).
Yeast cells were diluted approximately every 20 hours with fresh medium.
Yeast strains are listed in table S3.

Fig. 7. β-Barrel precursors in transit are in close proximity to
Sam50-loop 6. (A) [35S]Por1(β14 to 19) precursors carrying a cysteine in
β strand 14 at position 205 or 206 and [35S]Por1(β12 to 19) precursors
carrying a cysteine in β strand 12 at position 180 or 179 were imported for
5 min into isolated mitochondria containing Sam50 variants, followed by
cross-linking with BMH. Samples were analyzed by nonreducing SDS-PAGE and
autoradiography. Arrowheads, cysteine-specific Sam50-Por1 precursor
cross-linking products. (B) Schematic model summarizing the cross-linking
results of Fig. 7C and fig. S9, B and C. Black double-headed arrows, strong
cross-links; dashed black double-headed arrow, weak cross-links; red X, no
cross-links. (C) Radiolabeled Por1 constructs were imported into
mitochondria of yeast strains expressing the indicated Sam50 variants for
5 min. Samples were treated with BMH and analyzed as described in (A).
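
The medium recipes and the spot-test dilution series described in this section reduce to simple arithmetic. The following is a minimal sketch of that arithmetic and is not part of the published protocol; the function names, the 1-liter batch volume in the usage lines, and the conversion of roughly 1e7 cells per ml per OD600 unit are illustrative assumptions.

```python
# Illustrative helpers for the medium recipes and dilution series in this section.
# All names and the cells-per-OD conversion are assumptions, not from the protocol.

# Percent (w/v) values as stated for liquid YPG; % (w/v) means grams per 100 ml.
YPG_PERCENT_WV = {"yeast extract": 1.0, "bacto peptone": 2.0, "glycerol": 3.0}

# Dilution factors stated for the growth tests (1:10 to 1:10,000).
DILUTION_FACTORS = [10, 100, 1000, 10000]


def grams_for_volume(percent_wv, volume_ml):
    """Convert a % (w/v) specification into grams for a given volume."""
    return percent_wv * volume_ml / 100.0


def recipe(components, volume_ml):
    """Return grams of each component needed for volume_ml of medium."""
    return {name: grams_for_volume(pct, volume_ml) for name, pct in components.items()}


def dilution_series(start_od, factors):
    """OD600 equivalents of the suspension after each stated dilution factor."""
    return [start_od / f for f in factors]


def cells_per_spot(od, spot_volume_ul, cells_per_ml_per_od=1e7):
    """Rough cell number per spot; the default conversion is an assumption."""
    return od * cells_per_ml_per_od * spot_volume_ul / 1000.0


if __name__ == "__main__":
    for name, grams in recipe(YPG_PERCENT_WV, 1000).items():
        print(f"{name}: {grams:g} g per liter")
    for factor, od in zip(DILUTION_FACTORS, dilution_series(1.0, DILUTION_FACTORS)):
        print(f"1:{factor} -> OD600 {od:g}, ~{cells_per_spot(od, 5):.0f} cells in 5 ul")
```

The cell-number estimate is only an order-of-magnitude guide; the actual relation between OD600 and cell count depends on the strain and the photometer.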


Isolation of mitochondria 


Yeast cells were cultivated in YPG medium for 2 days as a preculture. The
main culture was inoculated with the preculture and incubated for at least
15 hours with shaking at 130 rpm and 30°C. Yeast expressing Sam50Δloop6
were grown in SL-medium at 30°C for 42.5 hours to ensure proper shutdown of
SAM50 wild-type.

Fig. 8. Putative model for sorting of β-barrel precursors through the
lateral gate of Sam50. β-Barrel precursors are translocated from the
intermembrane space side into the lumen of Sam50. The C-terminal β signal
of the precursor specifically binds to β strand 1 of Sam50 by replacing the
endogenous β signal of Sam50 (β strand 16). This

Yeast cells were harvested during log-phase by centrifugation at 1700 × g
(maximal relative centrifugal force; 4000 rpm, H-12000 Thermo-Fisher
Scientific) for 10 min at room temperature. Yeast cells were washed twice
with distilled H2O, and incubated with 2 ml/g wet weight DTT buffer [100 mM
Tris(hydroxymethyl)aminomethane (Tris)/H2SO4 (MP Biomedicals and Roth),
pH 9.4, 10 mM dithiothreitol (DTT, Roth)] for 20 min with shaking at
130 rpm and 30°C. Yeast cells were reisolated by centrifugation for 5 min
at 2700 × g (4000 rpm, SLA-3000 Sorvall) and incubated for 30 to 45 min in
6.5 ml/g wet weight Zymolyase buffer [16 mM K2HPO4 (Roth), 4 mM KH2PO4,
pH 7.4, 1.2 M sorbitol (Roth), 3 mg/ml Zymolyase 20T (Seikagaku Kogyo Co.)]
with shaking at 130 rpm and 30°C. The resulting spheroplasts were pelleted
by centrifugation for 5 min at 1500 × g (3000 rpm, SLA-3000 Sorvall) and
washed with buffer (16 mM K2HPO4, 4 mM KH2PO4, pH 7.4, 1.2 M sorbitol). The
pellet was resuspended in homogenization buffer (0.6 M sorbitol, 10 mM
Tris/HCl, pH 7.4, 1 mM ethylenediaminetetraacetic acid (EDTA, Calbiochem),
0.4% [w/v] bovine serum albumin (Sigma), 1 mM phenylmethyl sulfonyl
fluoride (PMSF, Sigma)). The spheroplasts were mechanically opened using a
glass-Teflon potter by homogenizing the solution 17 times on ice.
Mitochondria were isolated using a four-step centrifugation procedure. To
remove cell debris, the solution was spun for 15 min at 1450 × g (3500 rpm,
SS-34, Sorvall) at 4°C, followed by a high-speed spinning step to pellet
mitochondria at 18,500 × g (12,500 rpm, SS-34, Sorvall) for 15 min at 4°C.
The mitochondrial pellet was resuspended in ice-cold SEM buffer [250 mM
sucrose (MP Biomedicals), 10 mM 3-(N-morpholino)propanesulfonic acid (MOPS,
Sigma), pH 7.2, 1 mM EDTA] containing 1 mM PMSF and both centrifugation
steps were repeated. The mitochondrial pellet was resuspended in ice-cold
SEM and the protein concentration was determined using the Bradford protein
assay (69). The concentration was adjusted to 10 mg mitochondrial protein
per 1 ml SEM. Mitochondria were aliquoted, snap-frozen in liquid nitrogen
and stored at -80°C (70).
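
The protocol above quotes each spin both as relative centrifugal force and as rotor speed. As a rough consistency check, the standard relation RCF = 1.118 × 10^-5 × r × rpm^2 (r in cm) can be used to back-calculate the rotor radius implied by each pair. The sketch below does this for the SLA-3000 and SS-34 values quoted in the text; the function names are illustrative, and no rotor radius is taken from a manufacturer's data sheet.

```python
# Standard conversion between rotor speed and relative centrifugal force.
# RCF (in multiples of g) = 1.118e-5 * radius_cm * rpm**2
RCF_CONSTANT = 1.118e-5


def rcf(rpm, radius_cm):
    """Relative centrifugal force for a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def implied_radius_cm(rcf_g, rpm):
    """Rotor radius (cm) implied by a quoted RCF and rpm pair."""
    return rcf_g / (RCF_CONSTANT * rpm ** 2)


# (RCF, rpm) pairs as quoted in the isolation protocol.
QUOTED_SPINS = {
    "SLA-3000": [(2700, 4000), (1500, 3000)],
    "SS-34": [(1450, 3500), (18500, 12500)],
}

if __name__ == "__main__":
    for rotor, spins in QUOTED_SPINS.items():
        for force, speed in spins:
            radius = implied_radius_cm(force, speed)
            print(f"{rotor}: {force} x g at {speed} rpm implies r = {radius:.1f} cm")
```

For the values quoted above, each rotor gives a self-consistent radius (about 15 cm for the SLA-3000 and about 10.6 cm for the SS-34), which suggests the paired numbers were transcribed correctly.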


Oxidation assays with whole yeast cells 


Yeast cells were grown overnight in YPG medium at 30°C. Cells corresponding
to an OD600 of 1 were taken and harvested by centrifugation for 10 min at
1500 × g (4000 rpm, FA-45-24-11, Eppendorf). Cells were resuspended in
100 µl buffer A [2 mM PMSF, 2x protease inhibitor w/o EDTA (Roche), 1 mM
EDTA] and oxidized by adding 0.2 mM 4,4'-dipyridyl disulfide (4-DPS,
Sigma-Aldrich) (39). Cells were incubated on ice for 30 min followed by
addition of 50 mM iodoacetamide (IA, Sigma) and further incubated for
15 min on ice. After addition of 60 mM NaOH, cells were centrifuged for
10 min at 1700 × g (4000 rpm, FA 45-30-11, Eppendorf) and 4°C, resuspended
in Laemmli buffer and heated to 65°C for 10 min with vigorous shaking.

induces an opening of the lateral gate of Sam50 between β strands 1 and 16.
Further strands of the precursor are inserted into the lateral gate in
β hairpin-like structures. Loop 6 of Sam50 promotes transfer of the
precursor into


Sam50 modeling 


Potential templates were identified with the HHPRED server restricting the
search sequence to the Sam50 β-barrel domain (residues 125 to 484) (71).
The hidden Markov model-based homology search identified templates in the
PDB with good p- and E-values. This included structures from FhaC [PDB
code: 4QL0 (51)] and TamA [PDB code: 4C00 (18)] as well as several
structures from BamA [PDB codes: 4K3B and 4K3C (16); 4C4V (21); 4N75 (22)
and 5EKQ (23)], which exhibit considerable variations in the interaction
between the β1 and β16 strands. A Sam50 model was calculated from each
template using Modeller (72). The model obtained from the BamA structure
with PDB code 4N75 (22) with optimized alignment fit best to the
experimental results of disulfide bonds and cross-link formation
(model S1).

the lateral gate. The full-length precursor is released from the lateral
gate into the lipid phase of the outer membrane. Thinning of the membrane
in proximity to the lateral gate facilitates membrane insertion of the
β-barrel protein.

Despite low sequence identity of 14%, the β-barrel model of Sam50 shows
very good agreement with the structure of BamA (PDB code: 4N75) with a core
RMSD of 1.6 Å (Cα atoms of 310 out of 360 residues). Ramachandran analysis
using RAMPAGE (73) showed similar geometrical quality of the model compared
to the template (favored/allowed/outlier residues, model: 90.2% / 7.3% /
2.5% and template: 94.7% / 4.5% / 0.8%). Also, the distribution of charged
and aromatic residues with respect to barrel inward- and outward-facing
side chains agrees well between model and structure. In order to evaluate
the position of loop 6, we superimposed the model with five BamA structures
(PDB codes: 4K3B, 4K3C, 4C4V, 4N75 and 5EKQ) as well as the TamA structure
(PDB code: 4C00). They all show a highly similar overall structure for
loop 6, with identical positions for the conserved IRGF motif including
side chain orientations. IRGF faces the inside wall of the barrel (strands
13 to 16). Noteworthy is, for instance, the interaction between the
guanidino group of the motif's arginine residue and an aromatic side chain
of β-barrel strand 13. The Sam50 model agrees overall with the structures
of the loop and the position of IRGF side chains; for instance, R366
interacts with the aromatic ring of F413. Also, positions and orientations
of residues 369 to 371 in the Sam50 model agree with those of the
aforementioned structures. In addition, the side chain orientations of the
Sam50 β signal (strand 16) toward either the β-barrel lumen or the lipid
phase agree with the structure of the conserved β signal of mitochondrial
VDAC/Porin (42-44).
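
The model-quality figures quoted above (core RMSD over Cα atoms, alignment coverage, Ramachandran fractions) can be sanity-checked with a few lines of code. The sketch below is illustrative only: it computes an RMSD for coordinate pairs that are assumed to be already superimposed (it performs no structural fitting, unlike a full superposition tool), and the toy coordinates are hypothetical, not taken from the Sam50 model or from PDB entry 4N75.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between paired 3D points; assumes the sets are already superimposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def core_coverage(n_core, first_residue, last_residue):
    """Fraction of the modeled domain that is included in the core alignment."""
    domain_length = last_residue - first_residue + 1
    return n_core / domain_length


# Percentages quoted in the text: favored / allowed / outlier residues.
RAMACHANDRAN = {
    "model": (90.2, 7.3, 2.5),
    "template": (94.7, 4.5, 0.8),
}

if __name__ == "__main__":
    # Hypothetical toy coordinates (in Angstrom), offset by 1.6 along z.
    toy_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
    toy_b = [(0.0, 0.0, 1.6), (3.8, 0.0, 1.6)]
    print(f"toy RMSD: {rmsd(toy_a, toy_b):.2f} A")
    # Residues 125 to 484 span 360 positions; 310 of them form the core.
    print(f"core coverage: {core_coverage(310, 125, 484):.1%}")
    for label, fractions in RAMACHANDRAN.items():
        print(label, "fractions sum to", round(sum(fractions), 1))
```

A real comparison would first superimpose the two structures (for example with a Kabsch-type fit) and restrict the calculation to the aligned core residues.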

For graphical presentations, cysteine residues were included in silico at
relevant positions and disulfide bonds formed using Coot (74) before
figures were generated with PyMOL (The PyMOL Molecular Graphics System,
Version 1.6, Schrödinger, LLC). The Sam50 β-barrel models were oriented
according to the localization of the N-terminal POTRA domain in the
mitochondrial intermembrane space (13, 50).


In vitro transcription and translation 


Plasmids containing the coding region of the gene of interest and carrying
an upstream SP6 promoter binding region were incubated with TNT SP6 quick
coupled kit (Promega), an in vitro eukaryotic translation system based on
rabbit reticulocytes, in the presence of [35S]methionine (PerkinElmer). The
reaction was incubated for at least 90 min at 25°C with shaking at 300 rpm.
Reactions were stopped upon addition of 20 mM unlabeled methionine (Roth).
A clarifying step was performed at 125,000 × g (45,000 rpm, TLA-55,
Beckman) for 30 min at 4°C. Sucrose (0.3 M) was added to the supernatant
and the lysate was snap-frozen and stored at -80°C. Successful
transcription/translation was checked by SDS-PAGE and autoradiography.

Template DNA of cysteine mutants of Por1 and Tom40 constructs was generated
by PCR using 2x REDTaq ReadyMix (Genaxxon). Forward primers contained a
RTS™ wheat germ kit (5prime) specific 5'-CTTTAAGAAGGAGATATACC-3' sequence
upstream of the start codon. The corresponding reverse primers contained
downstream of the stop codon a 5'-TGATGATGAGAACCCCCCCC-3' wheat germ
sequence. Cysteine mutagenesis was performed using a primer encoding the
desired mutation. Successful mutations were confirmed by sequencing. For
enhanced radiolabeling of the protein fragment, the methionine encoding
sequence was added at the start codon and before the stop codon of the
primers used for in vitro transcription. PCR products were analyzed by
inspection of the DNA bands on 2% agarose (Biozym) gels. Products were
purified using the QIAquick PCR Purification Kit (Qiagen). A consecutive
PCR was performed according to the RTS™ wheat germ LinTempGenSet His6-tag
(5prime) manual using 2x REDTaq ReadyMix. PCR products were purified and
concentrated using the MinElute PCR purification kit (Qiagen). Wheat germ
lysate, a eukaryotic cell-free protein expression system based on wheat
germ, was used as described in the RTS™ 100 wheat germ CECF Kit (5prime)
with modification for radiolabeled lysates, including [35S]methionine in
the reaction solution and supplementation of unlabeled methionine with
[35S]methionine in the feeding solution. After incubation for 24 hours,
lysates were clarified by centrifugation at 125,000 × g (45,000 rpm,
TLA-55, Beckman) for 1 hour at 4°C. The supernatant was transferred to a
fresh tube, snap-frozen in liquid nitrogen and stored at -80°C. Successful
translation was checked by SDS-PAGE and autoradiography.


In vitro import into mitochondria and 
cross-linking and oxidation 


Mitochondria were thawed on ice. For one import 
reaction, 50 µg mitochondria (protein amount)
were resuspended in 100 µl import buffer (3% [w/v]
bovine serum albumin, 250 mM sucrose, 80 mM
KCl (Roth), 5 mM MgCl2, 2 mM KH2PO4, 5 mM
methionine (Sigma), 10 mM MOPS-KOH (Roth),
pH 7.2) with 4 mM ATP (Roche), 4 mM NADH
(Roth), 5 mM creatine phosphate (Roche) and
0.1 mM creatine kinase (Roche). To deplete the
membrane potential (−Δψ), AVO (8 nM antimycin
A (Sigma), 1 µM valinomycin (Sigma), 20 nM
oligomycin (Sigma), final concentrations) was
added and NADH was omitted (75). When Tim9
was imported, bovine serum albumin was omit- 
ted. 4 to 10% [v/v] of rabbit reticulocyte lysate 
or wheat germ lysate containing the precursor 
proteins were incubated with mitochondria at 
25°C with shaking at 300 rpm. Membrane po- 
tential dependent import reactions were stopped 
by addition of AVO, before the import reac- 
tions were transferred on ice. The mitochondria 
were pelleted for 10 min at 4°C and 18,500 x g 
(13,200 rpm, FA 45-30-11, Eppendorf), the super- 
natant was discarded and the pellet was washed 
with 100 µl SEM. Mitochondria were resuspended
in Laemmli buffer (0.2 M Tris, pH 6.8, 2% [w/v]
dodecylsulfate-Na-salt (SDS, Serva), 4% [v/v]
glycerol, 12.5% [w/v] bromphenol blue (Sigma),
1 mM PMSF, 50 mM iodoacetamide) and incubated
for 10 min at 65°C with vigorous shaking.

In case of experiments combining protein im- 
port and cross-linking or oxidation, mitochondria 
were incubated in 100 µl SEM including energy
mix (4 mM ATP, 4 mM NADH, 5 mM creatine 
phosphate, 0.1 mM creatine kinase) and 2 to 
6% [v/v] precursor-containing lysate was added. 
Import was conducted at 25°C for 5 to 30 min, 
shaking at 300 rpm, and the reactions were 
transferred on ice. To oxidize proteins, 0.36 mM 
4-DPS or 2.5 mM CuSO4 (Roth) was added to
the reaction. For cross-linking experiments, 1 mM
cross-linking reagent 1,6-bismaleimidohexane (BMH,
Thermo-Fisher Scientific), bismaleimidoethane
(BMOE, Thermo-Fisher Scientific),
1,3-propanediyl bismethanethiosulfonate (M3M, Interchim),
1,1-methanediyl bismethanethiosulfonate (M1M,
Interchim), N-α-maleimidoacet-oxysuccinimide
ester (AMAS, Thermo-Fisher Scientific) or
succinimidyl iodoacetate (SIA, Thermo-Fisher
Scientific) was added from a 10 mM stock solution
prepared in dimethyl sulfoxide (DMSO, Roth).
Samples were gently mixed and incubated on ice 
for 30 min. Oxidation/cross-linking reactions were 
stopped by addition of 50 mM iodoacetamide 
and incubated on ice for 15 min. Reactions were 
laid on top of 500 µl S500EM (500 mM sucrose,
10 mM MOPS, pH 7.2, 1 mM EDTA) and centrifuged
for 15 min at 4°C and 20,800 x g (14,000 rpm, FA
45-30-11, Eppendorf) for purification. The pellet
was resuspended in 100 µl SEM and processed as
described above. Samples analyzed on blue native
PAGE were resuspended in digitonin buffer
(0.1 mM EDTA, 10% [v/v] glycerol, 50 mM NaCl,
1 mM PMSF, 20 mM Tris/HCl, pH 7.4, 1% [w/v]
digitonin (Calbiochem)) and incubated on ice 
for 15 min before addition of blue native load- 
ing dye (0.5% [w/v] Coomassie blue G (Serva), 
50 mM 6-aminocaproic acid, 10 mM Bis-Tris/ 
HCl, pH 7). 
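
The reagent additions in this section (for example, 1 mM cross-linker taken from a 10 mM DMSO stock) reduce to a simple dilution calculation. The following is a minimal sketch, assuming the stock is added on top of an existing reaction volume; the function name and the pairing of the 100 µl reaction volume with the cross-linker stock are illustrative assumptions, not part of the published protocol.

```python
def stock_volume_to_add(reaction_volume_ul, stock_conc_mm, final_conc_mm):
    """Volume of stock (in µl) to add to an existing reaction so that the
    mixture reaches final_conc_mm.

    Derived from C_stock * V = C_final * (V_reaction + V).
    """
    if not 0 < final_conc_mm < stock_conc_mm:
        raise ValueError("final concentration must lie between 0 and the stock concentration")
    return reaction_volume_ul * final_conc_mm / (stock_conc_mm - final_conc_mm)


# Assumed example: 100 µl reaction, 10 mM stock, 1 mM final concentration.
volume = stock_volume_to_add(100.0, 10.0, 1.0)
print(f"add {volume:.1f} µl of stock")
```

If the stated volume is instead meant as the final volume including the stock, the simpler form V = C_final × V_final / C_stock applies.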


β-Signal affinity assay

The method was performed as described in Kutik 
et al. (13). Briefly, E. coli cells expressing
glutathione-S-transferase (GST), GST-β-signal(Por1) and
GST-β-signal(Por1 F281Q) were lysed and GST constructs
were bound to glutathione sepharose 4B beads 
(GE Healthcare). Mitochondria were solubilized 
in GST buffer L (20 mM 4-(2-hydroxyethyl)-1- 
piperazineethanesulfonic acid (HEPES, Roth), 
100 mM potassium acetate (KOAc, Roth), 10 mM 
magnesium acetate (Mg(OAc)2, Roth), 10% [v/v]
glycerol, 1% [w/v] digitonin (Calbiochem), 1 mM 
PMSF). After centrifugation for 10 min at 4°C and 
20,800 x g (14,000 rpm, FA 45-30-11, Eppendorf), 
the supernatant was transferred to GST bound 
sepharose beads and incubated for 30 min at 
4°C shaking end over end. Samples were cen- 
trifuged for 1 min at 4°C and 500 x g and washed 
at least seven times with GST buffer W (20 mM 
HEPES, 100 mM KOAc, 10 mM Mg(OAc)2, 10% [v/v]
glycerol, 0.5% [w/v] digitonin). To cleave bound 
proteins from GST, samples were incubated overnight
at 4°C, shaking at 800 rpm, in 200 µl GST
buffer T (20 mM HEPES, 100 mM KOAc, 10 mM
Mg(OAc)2, 10% [v/v] glycerol, 0.5% [w/v] digitonin,
2.56 mM CaCl2, 80 units thrombin protease
(Calbiochem)). Columns were centrifuged for
1 min at 4°C and 500 x g. The flow-through was
mixed with Laemmli buffer including 1% [v/v]
β-mercaptoethanol (Roth) and heated to 95°C
for 5 min. Samples were analyzed by SDS-PAGE. 
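
The centrifugation steps above quote both a relative centrifugal force and a rotor speed. As a rough consistency check, the sketch below uses the standard relation RCF = 1.118 × 10⁻⁵ × r × rpm² (r in cm) and the fact that, for a fixed rotor, RCF scales with the square of the speed. The function names are illustrative, and no rotor radius is taken from the text.

```python
def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (x g) for a given speed and rotor radius in cm."""
    return 1.118e-5 * radius_cm * rpm ** 2


def rescale_rcf(rcf_ref, rpm_ref, rpm_new):
    """RCF at a new speed in the same rotor; RCF scales with rpm squared."""
    return rcf_ref * (rpm_new / rpm_ref) ** 2


# Same rotor (FA 45-30-11): 18,500 x g is quoted at 13,200 rpm,
# so the value expected at 14,000 rpm can be estimated by rescaling.
estimate = rescale_rcf(18_500, 13_200, 14_000)
print(f"estimated RCF at 14,000 rpm: {estimate:.0f} x g")
```

The rescaled estimate lands close to the 20,800 x g quoted for 14,000 rpm, so the two rotor settings given in the protocol are mutually consistent.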


TEV protease cleavage assay 


In vitro import into mitochondria followed by 
cross-linking using BMOE was conducted as de- 
scribed above. After purification, samples were 
resuspended in solubilization buffer (20 mM 
Tris/HCl, pH 7.4, 0.1 mM EDTA, 50 mM NaCl) 
containing 6 M guanidinium hydrochloride (Roth). 
Samples were heated to 95°C and diluted 1:4 
in solubilization buffer. TEV protease (Thermo- 
Fisher Scientific) was added and incubated for 
30 min on ice. Samples were precipitated using 
14% [w/v] trichloroacetic acid (TCA, Roth) and
0.0125% [w/v] sodium deoxycholate (Sigma). 
Samples were resuspended in Laemmli buffer 
containing 1 mM PMSF and 10 mM IA and heated 
to 65°C for 10 min shaking vigorously. Samples 
were separated by 4 to 12% NuPAGE gels (Thermo-Fisher
Scientific) according to the manufacturer’s
protocol. 


Swelling assay 


To test the integrity of the mitochondrial outer 
membrane, 100 µg mitochondria (protein amount)
were thawed on ice and resuspended in either
100 µl hypotonic swelling buffer (1 mM EDTA,
10 mM MOPS/KOH, pH 7.2) or isotonic SEM 
buffer. Mitochondria/mitoplasts were incubated 
on ice for 15 min before the indicated amount of 
proteinase K (Roche) was added. The samples 
were further incubated for 15 min on ice. Pro- 
teinase K was inactivated by addition of 2 mM 
PMSF and further incubated on ice for 10 min. 
Mitochondria/mitoplasts were pelleted and washed 
with SEM buffer including 1 mM PMSF. Samples
were resuspended in Laemmli buffer, including
1% [v/v] β-mercaptoethanol and 1 mM PMSF,
and separated by SDS-PAGE. 


SDS-PAGE, NuPAGE, tris-tricine PAGE, 
blue native PAGE, and Western blotting 


SDS-PAGE was performed using 10% poly- 
acrylamide gels and SDS running buffer (25 mM 
Tris/HCl, pH 8.8, 191 mM glycine (MP Biomed- 
icals), 1% [w/v] SDS). Gels were run at 30 to 35 
mA for 3 to 5 hours. NuPAGE bis-tris pre-cast gels 
(10%, Thermo-Fisher Scientific) were used according
to the manufacturer’s instructions. Tris-Tricine
PAGE gels consisted of a 4 to 16% polyacrylamide
gradient (48% [w/v] acrylamide (Roth), 1.5% [w/v]
bisacrylamide (Serva)). Gels were run using anode
buffer (0.2 M Tris/HCl, pH 8.9) and cathode buffer
(0.1 M Tris, 0.1 M Tricine (Roth), 0.1% [w/v] SDS,
pH 8.25) at 70 mA for 3 to 5 hours. For all
above-mentioned gel electrophoresis procedures, samples
were resuspended in Laemmli buffer containing
1 mM PMSF and heated to 65°C for 10 min with
vigorous shaking. When samples were cross-linked
or oxidized, no DTT or β-mercaptoethanol was
added; 50 mM iodoacetamide was added instead.

Native protein complexes were analyzed using 
blue native PAGE (76). After import of radio- 
labeled proteins, mitochondria were resuspended 
in cold digitonin buffer (0.1 mM EDTA, 10% [v/v] 
glycerol, 50 mM NaCl, 1 mM PMSF, 20 mM Tris/ 
HCl, pH 7.4, 0.35 to 1% [w/v] digitonin) and 
incubated on ice for 15 min. Blue native loading 
dye (0.5% [w/v] Coomassie blue G (Serva), 50 mM 
6-aminocaproic acid (Sigma), 10 mM Bis/Tris 
(Roth), pH 7) was added. Samples were centri- 
fuged at 4°C for 15 min at 20,800 x g (14,000 rpm, 
FA 45-30-11, Eppendorf) and the supernatant was 
loaded on a 6 to 16.5% discontinuous gradient gel. 
8.5 cm gels were run in a cooled Hoefer SE600 
vertical electrophoresis chamber using anode buffer 
(50 mM Bis/Tris/HCl, pH 7) and cathode buffer 
(50 mM tricine, pH 7, 15 mM Bis/Tris, 0.02% [w/v] 
Coomassie G) at 90 mA and 600 V for 90 min. 

With the exception of blue native gels, gels con- 
taining radiolabeled samples were stained and 
fixed using staining buffer (30% [v/v] ethanol, 10% 
[v/v] acetic acid (Roth), 0.2% [w/v] Coomassie 
R250 (Roth)) followed by destaining with destain 
buffer (50% [v/v] methanol (Roth), 20% [v/v] acetic 
acid) until protein bands were clearly visible. Gels 
were dried onto Whatman paper (Macherey-Nagel) 
and exposed using PhosphorImager screens (GE 
Healthcare and Fuji), followed by autoradiographic
detection (Storm PhosphorImager, GE Healthcare;
FLA9000, Fujifilm).

When immunoblotting was performed, gels 
were incubated for 5 min in SDS running buffer 
after gel electrophoresis. Gel contents were trans- 
ferred onto PVDF membranes (Immobilon-P, 
Millipore) using standard semi dry Western blot- 
ting (77) at 250 mA for 2 hours using blotting 
buffer (20 mM Tris, 150 mM glycine, 0.02% [w/v] 
SDS, 20% [v/v] methanol). PVDF membranes 
were stained with staining buffer, destained using 
destain buffer until visible bands confirmed equal 
loading, and completely destained using 100% 
methanol. Blocking was performed for 1 hour 
using 5% [w/v] fat-free dried milk powder (Frema 
Reform) in TBST (200 mM Tris/HCl, pH 7.5, 
1.25 M CaCl2, 0.1% [v/v] Tween20 (Sigma)) at
room temperature. After washing in TBST, mem- 
branes were incubated with the designated pri- 
mary antibodies listed in table S4, overnight at 
4°C or for at least 1 hour at room temperature. 
After a second washing step in TBST, membranes 
were decorated with secondary anti-rabbit IgG
antibody (Sigma), diluted 1:5000, that was coupled
to horseradish peroxidase in 5% [w/v] fat-free
dried milk powder in TBST for 1 hour. After
washing a third time in TBST, membranes were 
incubated in ECL solution (GE Healthcare) and 
the chemiluminescence signal was detected by 
the LAS-4000 system (Fujifilm). 
Genome-wide identification of
interferon-sensitive mutations 
enables influenza vaccine design 


Yushen Du, Li Xin, Yuan Shi, Tian-Hao Zhang, Nicholas C. Wu, Lei Dai,
Danyang Gong, Gurpreet Brar, Sara Shu, Jiadi Luo, William Reiley,
Yen-Wen Tseng, Hongyan Bai, Ting-Ting Wu, Jieru Wang,
Yuelong Shu, Ren Sun


In conventional attenuated viral vaccines, immunogenicity is often suboptimal. Here we present 
a systematic approach for vaccine development that eliminates interferon (IFN)-modulating
functions genome-wide while maintaining virus replication fitness. We applied a quantitative 
high-throughput genomics system to influenza A virus that simultaneously measured the 
replication fitness and IFN sensitivity of mutations across the entire genome. By incorporating 
eight IFN-sensitive mutations, we generated a hyper-interferon-sensitive (HIS) virus as a 
vaccine candidate. HIS virus is highly attenuated in IFN-competent hosts but able to induce 
transient IFN responses, elicits robust humoral and cellular immune responses, and provides 
protection against homologous and heterologous viral challenges. Our approach, which 
attenuates the virus and promotes immune responses concurrently, is broadly applicable for
vaccine development against other pathogens.


Most viruses adapt rapidly to diverse selection
pressures, posing a challenge for
deploying safe and effective vaccines. In- 
fluenza viruses, for example, are charac- 
terized by large genetic diversity across 
subtypes and rapid antigenic drift and shift, which 
present problems for traditional vaccine strategies. 
Attenuation or inactivation of viruses tends to 
reduce the strength and breadth of immune re- 
sponses, resulting in ineffective protection against 
antigenic alterations (1-3). Previous pandemics and 
recent influenza outbreaks highlight the need to 
develop safe vaccines that elicit effective immune 
responses and confer broad protection. 

The type I interferon (IFN) system is the major 
component of innate immune responses (4-6). 
The IFN response provides the first line of defense
against viral infections by inducing the expression
of hundreds of IFN-stimulated genes
(ISGs), many of which have antiviral activities
(7). The IFN response is also critical for dendritic
cell maturation, development of B and T cells,
and memory formation, bridging innate and adaptive
immunity (8-12). Most viruses have evolved to
efficiently suppress the production and function 
of IFN to allow replication in vivo. Thus, sys- 
tematic elimination of IFN-modulating functions 
from the virus presents a potential approach for 
vaccine development (fig. S1) (13, 14). Removing 
the most well-characterized IFN modulator in 
influenza virus—namely, the NS1 protein—has 
shown promise in a vaccine candidate (delNS1) 
in phase 1/2 clinical trials (14, 15). Although studies
have suggested that influenza proteins other
than NS1 have IFN-modulating functions (16, 17),
genome-wide identification and elimination of 
IFN-modulating functions without affecting viral 
replication fitness in vitro have remained challenging 
tasks. 

To tackle this challenge, we developed a quan- 
titative high-throughput genomics system, which 
combines saturation mutagenesis and next- 
generation sequencing, to comprehensively iden- 
tify IFN-modulating functions in the entire viral 
genome (18). This system has enabled us to quan- 
titatively measure the replication capacity of a 
large number of mutants in parallel under spe- 
cific conditions (18, 19). We performed compar- 
ative profiling of the entire influenza genome with 
and without IFN selection, which led to the iden- 
tification of IFN-modulating functions on multiple 
viral segments. By combining eight IFN-sensitive 
mutations across the viral genome, we generated
a hyper-interferon-sensitive (HIS) virus that is
replication-competent in vitro but highly attenu- 
ated in IFN-competent hosts in vivo. The HIS 
virus showed desired properties as a safe and 
effective live attenuated influenza vaccine with 
robust humoral and cellular responses, and it 
provided broad protection against homologous 
and heterologous viral challenges in mice and 
ferrets. 


Fitness profile of the influenza A viral 
genome at single-nucleotide resolution 


The eight-plasmid reverse genetic system carrying 
the influenza A/WSN/33 (H1N1) virus genome 
was used for the construction of mutant plasmid 
libraries (20). The mutants were divided into 52 
sublibraries, each of which contained single- 
nucleotide mutations in a small genome region 
of 240 base pairs that were generated by error-prone
polymerase chain reaction (fig. S2) (21-23).
Viral mutant libraries were reconstituted in hu- 
man embryonic kidney 293T cells by cotransfect- 
ing the plasmid encoding the sublibrary of mutants 
with the other seven plasmids encoding wild- 
type (WT) viral proteins. To systematically iden- 
tify IFN-modulating functions, all viral libraries 
were selected in A549 cells with or without exogenous
IFN treatment (IFN-α2 at inhibitory concentration
80) (29). Illumina sequencing was used
to identify each mutant and to calculate the cor- 
responding frequency within each sublibrary. 
The relative fitness (RF) score of a mutant virus 
was calculated as the ratio of the relative fre- 
quency in the selected virus library to that in the 
plasmid library (Fig. 1A and table S1). There were 
strong correlations between biological duplicates 
of transfection and of selection (fig. S3). We ob- 
served a clear separation of the distribution of 
fitness effects between synonymous mutations 
and nonsense mutations (fig. S4A), indicating 
effective selection on virus mutants. To further 
validate the accuracy of the fitness profiling, we 
randomly selected 26 missense mutations and 
characterized the corresponding mutant viruses 
individually. The replication capacity of each mu- 
tant was highly correlated with the RF scores 
from the fitness profiling (fig. S4B). Using synony- 
mous mutations as a benchmark, 50.7% of missense 
mutations across the whole genome were dele- 
terious, in accordance with previous findings that 
single mutations are poorly tolerated in the ge- 
nomes of RNA viruses (fig. S5A) (24, 25). 
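
The RF score defined above is a ratio of two relative frequencies. The sketch below shows one way to compute it from read counts. It assumes that "relative frequency" means the mutant read count divided by the total read count of its sublibrary; the text does not specify the normalization, and the function name and counts are made up for illustration.

```python
import math


def relative_fitness(selected_count, selected_total, plasmid_count, plasmid_total):
    """RF = (relative frequency after selection) / (relative frequency in plasmid library).

    Relative frequency is taken here as mutant reads / total reads (an assumption).
    """
    if selected_total <= 0 or plasmid_total <= 0 or plasmid_count <= 0:
        raise ValueError("totals and the plasmid count must be positive")
    freq_selected = selected_count / selected_total
    freq_plasmid = plasmid_count / plasmid_total
    return freq_selected / freq_plasmid


# Made-up read counts for one hypothetical mutant in one sublibrary.
rf = relative_fitness(50, 10_000, 100, 10_000)
print(f"RF = {rf:.2f}, log10(RF) = {math.log10(rf):.2f}")
```

Under this definition, an RF near 1 indicates a mutation that is roughly neutral under the selection condition, and an RF well below 1 indicates a deleterious one.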


Systematic identification of 
IFN-sensitive mutations 


The RF scores of most mutants are correlated in 
the presence and absence of exogenous IFN 
treatment; however, we observed a set of muta- 
tions that were nearly neutral in the absence of 
IFN but highly deleterious under IFN selection 
(Fig. 1B and fig. S6). These putative IFN-sensitive 
mutations were widespread on multiple viral 
segments. Among all influenza A viral proteins, 
NS1 has been extensively studied for its interac- 
tion with the IFN pathway (19, 26, 27), which is 
validated both in our fitness profiling and individually
constructed NS1 mutant viruses (fig. S7).
Fig. 1. Identification of IFN-sensitive mutations using quantitative high-throughput
genomics. (A) Relative fitness (RF) scores for individual
mutations in A549 cells with (right) and without (left) IFN selection across the
influenza A/WSN/33 genome. (B and C) Identification of IFN-sensitive
mutations with PB2 protein as an example [Protein Data Bank (PDB) ID,
4WSB] (38, 39). Red and orange represent strong and intermediate IFN
sensitivity, respectively. (D) Validation of IFN sensitivity with individually
reconstituted mutants (n = 4). The top eight mutations on nonsurface virion
proteins are shown in black. (E) Induction of IFN-β expression in A549
cells infected with WT virus or indicated mutants at 6 hours post-infection,
with mock infection as control (Ctl) (n = 3). Error bars, SD. *P < 0.05,
**P < 0.01 [two-tailed t test compared with WT (D) or with Ctl (E)].
Single-letter abbreviations for the amino acid residues are as
follows: A, Ala; C, Cys; D, Asp; E, Glu; F, Phe; G, Gly; H, His; I, Ile;
K, Lys; L, Leu; M, Met; N, Asn; P, Pro; Q, Gln; R, Arg; S, Ser; T, Thr; V, Val;
W, Trp; and Y, Tyr.

To further explore IFN-modulating functions across
the genome, we focused on IFN-sensitive mutations
outside NS1, especially the solvent-exposed
and structurally clustered residues in the polymerase
complex (PB2, PB1, PA, and NP), as well
as the M1 and M2 proteins (Fig. 1C and fig. S6).
Twenty-six mutations were constructed individually,
most of which were nearly neutral for viral
replication with nearly intact polymerase activity
(fig. S8). These included the previously characterized
mutations PB2-N9D, which is known to
counteract the inhibition of MAVS (mitochondrial
antiviral signaling protein)-induced IFN-β
production by PB2 (16), and M1-D30N, which has
been shown to induce IFN-β production (17). Several
mutations significantly increased IFN sensitivity
compared with WT, and the top eight were
chosen for further characterization (Fig. 1D). Six of
them (PB2-N9D, PB2-Q75H, PB2-T76A, M1-N36Y,
M1-R72Q, and M1-S225T) elevated the expression
of IFN-β and ISG54 (Fig. 1E and fig. S9) and stimulated
nuclear translocation of IRF3 (fig. S10). We
also observed that the IFN induction was MAVS-dependent
and STING (stimulator of interferon
genes)-independent (fig. S11). Moreover, these
six mutants were not sensitive to IFN treatment in
Vero cells, which are deficient in IFN production.
However, the other two mutations (PB1-L155H and
PA-E181D) did not induce higher IFN production
(Fig. 1E) and were still IFN-sensitive in Vero cells,
suggesting that these mutants likely affect processes
downstream of IFN production.


Combining mutations increases IFN 
sensitivity and IFN induction in vitro 


To maximize IFN sensitivity and IFN induction, 
we combined three IFN-inducing mutations on 
PB2 (N9D, Q75H, and T76A), three on M1 (N36Y, 
R72Q, and S225T), and two previously reported 
ones on NS1 (R38A and K41A) to create the HIS 
virus. The growth of HIS virus in IFN-competent 
A549 cells showed significant attenuation compared 
with that of WT virus (1.4-log decrease at 36 hours
and 1.8-log decrease at 60 hours) but was fully
restored in IFN-deficient Vero cells (Fig. 2, A
and B).

Fig. 2. The combination of mutations in HIS virus increases IFN
sensitivity and IFN production. (A and B) Replication kinetics of WT, NS1
mutant, and HIS viruses in A549 (A) and Vero (B) cells. (C) IFN sensitivity
of WT, NS1 mutant, and HIS viruses (n = 4). (D) Induction of IFN-β expression
by indicated virus in A549 cells at 6 hours post-infection, with mock
infection as Ctl (n = 3). (E) Global gene expression in A549 cells infected
with indicated viruses was examined by RNA sequencing (n = 2). The
heatmap shows the genes that were significantly differentially expressed
in HIS-infected cells compared with mock-infected cells. IFN response
genes are marked on the left with black bars. (F) GO enrichment analysis
of genes up-regulated in HIS-infected cells in comparison with mock-
(top) or WT-infected (bottom) cells. (G) Induction of IFN-β expression
by indicated viruses in primary human alveolar macrophages (AMs),
human alveolar epithelial cells (AECs), human small airway epithelial
cells (HSAECs), and human bronchial epithelial cells (HBECs) at 6 hours
post-infection, with mock infection as Ctl (n = 3). (H) Induction of indicated
ISGs and inflammatory cytokines in primary human AMs at 6 hours
post-infection (n = 3). Error bars, SD. *P < 0.05, **P < 0.01, ***P < 0.001
[one-way analysis of variance (ANOVA) with Bonferroni multiple
comparisons test]; n.s., not significant.

The IFN sensitivity of HIS virus was
significantly higher than that of the NS1-R38A/ 
K41A mutant, indicating an independent effect
of mutations on PB2 and M1 (Fig. 2C). Gene 
expression data from lung epithelial and mac- 
rophage cell lines (A549 and THP1) showed that 
HIS virus induced higher IFN production and 
responses (Fig. 2D and fig. S12, A to C). Using RNA
sequencing, we evaluated the global gene expres- 
sion changes in A549 cells infected with WT, 
NS1-R38A/K41A, or HIS virus. At 6 hours post- 
infection, the expression of 120 genes was signifi- 
cantly up-regulated (fold change > 2 and P < 0.001) 
in HIS-infected cells, of which 24% were IFN 
response genes (Fig. 2E, fig. S12D, and table S2).
Gene Ontology (GO) enrichment analysis revealed 
that the pathways related to IFN production and 
response were the dominant ones activated by 
HIS virus, to a greater extent than by WT or 
mock infection (Fig. 2F). Furthermore, HIS virus
induced negative regulators of the apoptotic
process, such as TNFAIP3, an important inhibitor of
TNF-mediated apoptosis. Slower cell death was 
observed with HIS infection than with WT in- 
fection (fig. S12E). 
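
The up-regulation criterion quoted above (fold change > 2 and P < 0.001) and the reported share of IFN response genes can be expressed as a short filter. The following is a minimal sketch using placeholder gene names and made-up values; the function names are illustrative, and no values from table S2 are reproduced.

```python
def upregulated(results, min_fold=2.0, max_p=0.001):
    """Genes passing fold change > min_fold and P < max_p.

    results is an iterable of (gene, fold_change, p_value) tuples.
    """
    return [gene for gene, fold, p in results if fold > min_fold and p < max_p]


def fraction_in_set(genes, gene_set):
    """Fraction of the listed genes that belong to gene_set."""
    if not genes:
        return 0.0
    return sum(gene in gene_set for gene in genes) / len(genes)


# Toy differential-expression results (placeholder names, made-up values).
toy_results = [
    ("gene1", 8.0, 1e-6),  # passes both cutoffs
    ("gene2", 1.5, 1e-6),  # fold change too small
    ("gene3", 4.0, 0.01),  # P value too large
    ("gene4", 3.0, 1e-4),  # passes both cutoffs
]
hits = upregulated(toy_results)
print(hits, fraction_in_set(hits, {"gene1"}))
```

With the figures quoted in the text, 24% of 120 up-regulated genes corresponds to roughly 29 IFN response genes.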

We further defined the phenotypes of HIS 
virus with a panel of human lung cells, including 
immortalized small airway epithelial cells, bron- 
chial epithelial cells, primary alveolar epithelial 
cells, and primary alveolar macrophages (Fig. 2G). 
HIS virus induced the strongest up-regulation of 
IFN-B expression (~50-fold relative to WT) in the 
primary alveolar macrophages, an important tar- 
get for influenza infection (Fig. 2G), and greater 
up-regulation of ISGs than WT virus (Fig. 2F). 
HIS virus did not enhance the expression of 
other inflammatory cytokines [CXCL1, CXCL5, or 
interleukin-1β (IL-1β)] in the infected macrophages,
highlighting its specific effects on the IFN pathway 
(Fig. 2H). The phenotype of HIS virus is not lim- 
ited to the WSN background: Introducing these 
eight mutations into another H1N1 strain of
influenza, A/PR8/34 (PR8-HIS), led to a similar 
phenotype (fig. S12, F and G). The up-regulation 
of the IFN pathway requires active viral infection, 
given that formalin-inactivated HIS virus lost 
the ability to induce higher IFN-β expression
(Fig. 2G). 


HIS virus is highly attenuated in 
IFN-competent mice and ferrets 


We next measured the replication and patho- 
genesis of HIS virus in mice and ferrets, the most 
commonly used animal models for influenza virus. 
BALB/c mice were intranasally inoculated with 
WT or HIS virus at different doses. Whereas the 
median lethal dose of WT virus was 5 x 10° TCID50 
(50% tissue culture infective dose), and 1 x 10° 
TCID50 caused obvious body weight loss in all 
animals, neither weight loss nor indicative clinical 
symptoms were observed in HIS-infected mice 
given 1 x 10’ TCID50, the highest dose that we
have tested (Fig. 3, A and B). To compare the HIS 
virus approach with the live attenuated vaccine
strategy used in FluMist, we incorporated the
five cold-adapted (CA) mutations from FluMist
into the WSN background and generated a WSN-CA
virus (28, 29). WSN-CA virus replicated well at 33°C
but was highly attenuated at 39°C and induced
IFN-β expression to a similar level as WT virus,
which was significantly lower than that induced
by HIS virus (fig. S13). By day 2 post-inoculation,
replication of HIS virus in mouse lung tissues
was significantly lower than that of WT virus
(~3.6-log decrease) or the NS1-R38A/K41A mutant
(~2-log decrease) and comparable to that of
WSN-CA virus (Fig. 3C and fig. S14A). In contrast
with the robust viral replication observed for WT
infection, which peaked at 48 hours, no increase
in viral copy number was detected in HIS-infected
mice at any tested time point (Fig. 3D). PR8-HIS
virus was also significantly attenuated compared
with WT PR8 virus in mouse lung tissues
(fig. S14B). Although highly attenuated in replication,
HIS virus showed transient yet significant
up-regulation of IFN and ISGs at 6 and 24 hours
post-infection, after which the response was
diminished (Fig. 3E). In contrast, WT virus induced
a robust pro-inflammatory response throughout
the course of infection, exemplified by the high
induction of CXCL10 at 48 and 120 hours
post-infection (Fig. 3F). These results correlate well
with histological analysis of infected lungs and
cytospins of bronchoalveolar lavage (BAL) fluid
(Fig. 3, G and H, and fig. S14, C to G). HIS-infected
lungs showed infiltration of neutrophils and
lymphocytes at day 2 post-infection; however, the
infiltration was transient and cleared by day 9.
Sustained inflammation and tissue damage were
observed for WT-infected lungs, which became
more severe by day 9 post-infection (Fig. 3H and
fig. S14, H and I).

Fig. 3. HIS virus is replication-deficient in vivo and induces a transient
IFN response. (A and B) Survival rate and percentage of body weight loss
after intranasal infection (n = 5). (C and D) Viral titers at day 2 post-infection
(n = 4) (C) and replication kinetics (n = 3) (D) of WT and HIS viruses in mouse
lung tissues. (E) Induction of indicated ISGs in mouse lung tissues at 6, 24,
48, and 120 hours (h) post-infection (n = 3), shown as fold of induction over
mock infection. RNase H, ribonuclease H. (F) Gene expression of indicated
inflammatory cytokines in mouse lung tissues was examined by RNA
sequencing (n = 2). (G) HE (hematoxylin and eosin) staining of lung tissues at
day 9 post-infection. Thick arrows, bronchioles; thin arrows, vessels; red triangles,
inflammatory cell infiltration. (H) Percentage of neutrophils, monocytes, and
lymphocytes in BAL cytospins at day 9 post-infection (n = 3). (I) Cytokines in
BAL samples measured by Luminex multiplex assay (n = 4). (J) Replication of
indicated viruses in lung tissues of IFNAR−/− mice (n = 4). (K) Viral titer of WT
and HIS viruses in ferret nasal wash, trachea, and lung tissues (n = 3). Dashed
lines represent detection limits. Error bars, SD. *P < 0.05, **P < 0.01, ***P <
0.001 [log-rank test for (A); ANOVA with Bonferroni multiple comparisons test
for (C), (H), and (J); and two-tailed t test for (D), (E), (I), and (K)].

We also examined the cytokine
response in the BAL samples at 48 hours post- 
infection by means of Luminex multiplex assay 
(Fig. 3I and fig. S14, J and K). WT infection showed 
significantly higher levels of IL-6 and CXCL1, con- 
sistent with the observed severe inflammation. In 
contrast, HIS virus induced higher amounts of 
IL-12 and G-CSF, which are important for granulocyte
stimulation and T cell development. Furthermore,
replication of HIS virus was fully restored
to WT levels in IFNAR−/− mice, indicating that
the inability to counteract IFN response was the
underlying mechanism for the highly attenuated
replication of HIS virus in wild-type mice (Fig. 3J).
In the ferret model, we also observed significant 
attenuation of HIS virus (Fig. 3K). By day 3 post- 
infection, HIS virus showed a ~2-log decrease in 
trachea and a ~1.5-log decrease in lung tissues 
compared with WT virus.

Fig. 4. HIS virus induces strong adaptive immune responses in mice and
ferrets. (A to D) HA-binding IgG (n = 7), HA neutralizing antibody (n = 7), and
NP- and NA-binding IgG (n = 4) in mouse sera at day 28 post-vaccination. HAI,
hemagglutinin inhibition. (E) HA-binding IgA in BAL samples at day 28
post-vaccination (n = 4). The optical density (OD) in ELISA was 450 nm. (F) HA
neutralizing antibody levels in ferret sera at day 22 post-vaccination (n = 3).
The dashed line represents the detection limit. (G) Mutations not
neutralized by mouse sera (red) were mapped onto the HA structure (PDB ID,
1RUZ; n = 5) (40). The other five colors represent five well-characterized
neutralization epitopes. (H) Tetramer staining of antigen-specific CD8 T cells
in mouse lung (left) and spleen (right) at day 10 post-vaccination (n = 10).
(I) Percentage of antigen-specific memory precursor effector cells in mouse lung
and spleen (n = 3). (J) NP antigen-specific CD8 T cells during the secondary
responses in lung tissues from mice vaccinated with indicated viruses (n = 4).
(K) Intracellular IFN-γ staining of CD4 T cells induced by the indicated
viruses (n = 3). (L and M) Clonality of TCRβ sequences of NP antigen-specific
CD8 T cells during the primary (n = 5) (L) or secondary (n = 4) (M)
responses. (N) Box plots show the fitness distribution of mutations on T cell
epitopes or antibody epitopes. Error bars, SD. *P < 0.05, **P < 0.01, ***P <
0.001 [ANOVA with Bonferroni multiple comparisons test for (A) to (D), (F),
(H), (J), and (K); two-tailed t test for (I), (L), and (M); and Wilcoxon rank sum
test for (N)].

Moreover, no infectious
viral particles were detected in nasal washes of
HIS-infected ferrets, in contrast to the robust
viral shedding observed during WT infection.


HIS virus induces strong and broad 
adaptive immune responses 


We then examined the ability of the HIS virus to 
induce humoral and cellular responses. Mouse 
sera and BAL samples were collected at day 28 
after single-dose (1 × 10⁴ TCID50) vaccination with
WT, HIS, or WSN-CA virus. HIS virus induced 
robust antibody responses, as measured by ELISA 
(enzyme-linked immunosorbent assay) and he- 
magglutinin (HA) inhibition and neutralization 
antibody assays (Fig. 4, A to E, and fig. S15). The
level of HA antibody responses elicited by HIS 
virus was lower than for WT virus, yet signifi-
cantly higher than for the WSN-CA, inactivated
WT, and inactivated HIS viruses (Fig. 4, A and B,
and fig. S15, C and D). Immunoglobulin G (IgG)
antibodies against NP, NA, and M1 proteins, which 
have been shown to play an important role in 
limiting viral replication (30, 31), were also detected
in the sera of HIS-vaccinated mice at a level com- 
parable to that in WT-infected mice (Fig. 4, C and 
D, and fig. S15E). Furthermore, mucosal immune
responses, indicated by secretory IgA antibodies 
against HA and NP proteins, were elicited by HIS 
vaccination (Fig. 4E and fig. S15F). Robust HA
antibody responses were also observed in ferrets
vaccinated with HIS virus (Fig. 4F and fig. S15G),
which were sustained for at least 50 days post-
vaccination. To examine the epitope coverage of
the neutralizing antibodies generated by HIS virus,
we profiled the HA mutants in the presence or
absence of mouse serum antibodies by using the
high-throughput genomic approach (32). Muta-
tions not neutralized by sera were observed in
both head (Ca2 and Sa sites) and stem regions,
with no significant difference in the number or
the distribution of mutations between the WT
and HIS viruses (Fig. 4G, fig. S16, and table S3).
This suggests that the breadth and diversity of
neutralizing antibodies induced by the HIS virus
are comparable to those induced by the WT virus.

In addition to humoral responses, HIS virus
elicited NP and PB1 antigen-specific CD8 T cell
responses, similarly to WT virus and much more
strongly than the WSN-CA, inactivated WT, and
inactivated HIS viruses (Fig. 4H and fig. S17, A to 
D). The CD8 T cells induced by the WT and HIS 
viruses had a similar capacity for IFN-γ produc-
tion upon stimulation by viral epitope peptides
(fig. S17E). We further examined the phenotypes
of virus-specific T cells by quantifying the expres-
sion of KLRG1, CD127, CD44, CD62L, and CCR7.
By day 21 post-infection, the NP and PB1 antigen- 
specific CD8 T cells induced by the WT and HIS 
viruses displayed similar levels of memory precur-
sor effector cells with a CD127^hi KLRG1^lo phe-
notype and short-lived effector cells with a
CD127^lo KLRG1^hi phenotype (Fig. 4I and fig. S17F).
These virus-specific CD8 T cells also displayed a 
similar effector/memory phenotype, as measured 
by CD62L, CD44, and CCR7 expression (fig. S17, 
G and H). Consistently, after challenge infection 
at 1 month post-vaccination, HIS virus induced 
the secondary CD8 T cell responses similarly to 
WT but more strongly than WSN-CA virus (Fig. 
4J and fig. S17I). Moreover, similar frequencies of
influenza-specific CD4 T cells were elicited by the 
WT and HIS viruses (Fig. 4K). To examine the 
diversity of the primary and secondary T cell re- 
sponses, we analyzed the T cell receptor repertoire 
by sequencing the β T cell receptor (TCRβ) loci of
NP-specific CD8 T cells in mice vaccinated with
WT or HIS virus. The Vβ usage and clonality for
both primary and secondary T cell responses were 
comparable between the WT and HIS viruses, 
documenting the diversity of T cell lineages induced 
by HIS vaccination (Fig. 4, L and M, and fig. S18). 
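The clonality comparison above can be illustrated with a short sketch. The text does not state how clonality was computed; the version below assumes one common definition, 1 minus the normalized Shannon entropy of clonotype frequencies, and the function name and input format are hypothetical.

```python
import math


def clonality(clonotype_counts):
    """Return 1 - normalized Shannon entropy of clonotype frequencies.

    Assumed definition (not taken from the paper): 0 means all clonotypes
    are equally abundant, 1 means a single clonotype dominates completely.
    """
    counts = [c for c in clonotype_counts if c > 0]
    total = sum(counts)
    if total == 0:
        raise ValueError("at least one clonotype must have a positive count")
    if len(counts) == 1:
        return 1.0  # a single clonotype is maximally clonal
    entropy = -sum((c / total) * math.log(c / total) for c in counts)
    return 1.0 - entropy / math.log(len(counts))


# Hypothetical read counts per TCR beta clonotype, for illustration only.
even_repertoire = [10, 10, 10, 10]
skewed_repertoire = [37, 1, 1, 1]
```

Under this definition, two repertoires with comparable clonality values have a similar balance between dominant and rare clonotypes, which is the sense in which the WT and HIS responses are described as comparable.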

We analyzed the potential impact of immune 
responses on the viral genome at the population 
level. Our whole-genome fitness profiling provides 
a data set for examining the genetic flexibility of 
viral sequences. We calculated the fitness cost of 
mutations in the previously identified B and T cell 
epitopes. Mutations on several T cell epitopes, but 
not on antibody epitopes, were generally cor- 
related with lower fitness scores (Fig. 4N and 
table S4). Our results suggest that an escape from 
T cell selection will impose a higher fitness cost 
for the virus, and thus T cell responses will be 
effective against vaccine escape. 
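A minimal sketch of the kind of comparison behind Fig. 4N, assuming a Wilcoxon rank sum (Mann-Whitney U) statistic with a normal approximation and no tie correction. The fitness values are invented for illustration and are not data from the study; the function name is hypothetical.

```python
import math


def rank_sum_u(sample_a, sample_b):
    """Return (U, z) for sample_a versus sample_b.

    U counts the pairs in which a value from sample_a exceeds a value from
    sample_b (ties count 0.5). z is the normal approximation of U without
    a tie correction, so it is only a rough guide for small samples.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a == 0 or n_b == 0:
        raise ValueError("both samples must be non-empty")
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    mean_u = n_a * n_b / 2.0
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12.0)
    return u, (u - mean_u) / sd_u


# Hypothetical relative-fitness scores of mutations, for illustration only.
t_cell_epitope_fitness = [0.21, 0.35, 0.18, 0.40, 0.27]
antibody_epitope_fitness = [0.62, 0.71, 0.55, 0.80, 0.49]
u_stat, z_score = rank_sum_u(t_cell_epitope_fitness, antibody_epitope_fitness)
```

A strongly negative z here would indicate that mutations in the first group tend to have lower fitness scores than those in the second, which is the pattern the text reports for T cell epitopes relative to antibody epitopes.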


HIS virus protects against homologous 
and heterologous viral challenge 


We examined whether HIS vaccination could offer 
protection against homologous and heterologous 
viral challenges. Immunized mice were challenged 
28 days post-vaccination with 1 × 10⁴ TCID50 of
WT virus. Vaccination by HIS virus reduced viral 
replication by ~3 log, with no sign of weight loss 
(Fig. 5 and fig. S19). Complete protection without 
detectable viral titers in the lung was achieved 
with one vaccination at a high dose (1 × 10⁶ TCID50)
or two vaccinations at a low dose (1 × 10⁴ TCID50)
(Fig. 5B and fig. S19B). Similar protective effects 
were observed in ferrets, which were challenged 
with 1 × 10⁷ TCID50 of WT virus at day 35 post-
vaccination. Nasal washes were collected at days
1, 3, 4, 7, and 9 post-challenge, and no infectious 
viral particles were detected in nasal washes from 
HIS-vaccinated ferrets throughout this time period 
(Fig. 5C). 
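The "~3 log" reduction quoted above is a difference of base-10 logarithms of titers. The sketch below shows that arithmetic under the assumption that titers below the assay detection limit are set to the limit, so the result is then a lower bound; the function name and the example numbers are hypothetical.

```python
import math


def log10_reduction(control_titer, treated_titer, detection_limit):
    """Return log10(control) - log10(treated).

    Titers below the detection limit are clamped to the limit (an assumed
    convention), which makes the returned reduction a lower bound whenever
    the treated sample is undetectable.
    """
    if detection_limit <= 0:
        raise ValueError("detection limit must be positive")
    control = max(control_titer, detection_limit)
    treated = max(treated_titer, detection_limit)
    return math.log10(control) - math.log10(treated)


# Hypothetical titers in TCID50 per unit of tissue, for illustration only.
measured = log10_reduction(1e6, 1e3, detection_limit=10)
undetectable = log10_reduction(1e5, 0, detection_limit=100)
```

With these illustrative inputs both calls give a 3-log reduction, but only the first is a measured value; the second reports how far the titer fell before reaching the detection limit.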
To test whether HIS vaccination provides
protection against heterologous strains, we first 
challenged immunized mice with PR8 virus and 
examined viral titer at day 2 post-challenge. HIS 
vaccination reduced viral titer by ~3 log compared 
with mock vaccination and significantly more than 
WSN-CA vaccination (fig. S19C). We further chal- 
lenged vaccinated mice with a lethal dose of three 
different influenza strains: H1N1 subtypes A/PR8/
34 and A/Cal/04/09 and H3N2 subtype A/X-31. 
Protection by HIS vaccination was observed in 
all measures, including survival rate, percentage 
of body weight loss, and clinical scores (Fig. 5, D 
and E, and fig. S19D). Strong secondary antigen- 
specific T cell responses were observed in the 
challenged mice for all strains (fig. S20). HIS 
vaccination also protected ferrets from heter- 
ologous A/Cal/07/09 challenge, as shown by 
viral titer in nasal washes and percentage of 
body weight loss (Fig. 5F and fig. S19E). 


Discussion 


Conventional approaches to develop vaccines render 
the virus avirulent but also reduce immunoge- 
nicity. We developed a quantitative high-throughput 
genomics approach to systematically identify 
and eliminate immune-modulating functions 
in the virus genome while maintaining replica-
tion fitness in vitro. This is a systems-based strategy
to enhance viral immunogenicity while attenuat- 
ing replication and pathogenesis. In this proof-of- 
principle study, we generated a HIS virus with a 
combination of eight IFN-sensitive mutations. 
These mutations also induced higher IFN pro- 
duction and response. We demonstrated that HIS 
virus is highly attenuated in vivo but is able to 
induce transient IFN responses, elicit robust and 
diverse humoral and cellular immunity, and provide 
protection against homologous and heterologous 
viral challenges in mice and ferrets. 

Recent studies have suggested several strategies 
to design live attenuated vaccines (14, 15, 33-37). 
Our method is distinctive in the following aspects: 
(i) We systematically investigated the whole viral 
genome, and we eliminated immune-evasion func- 
tions at multiple loci to obtain a safe strain that 
has no detectable replication in vivo; (ii) we 
selected mutants that induce a higher IFN response, 
because a transient IFN response has been shown to 
be essential for adaptive immunity, including the 
strong and diverse T cell responses; (iii) HIS virus 
selectively induced a transient IFN response but no 
other tested inflammatory responses, which reduced 
potential pathogenesis or side effects for future
clinical usage. We have also applied this approach
to a DNA virus and generated an effective vaccine
candidate.

Fig. 5. HIS virus protects mice and ferrets from broad viral challenges. (A and B) Viral load in
mouse lung tissues at day 2 post-challenge (n = 4). DV, double vaccinations with HIS virus at 1 × 10⁴
TCID50, 28 days apart; HD, high-dose vaccination with HIS virus at 1 × 10⁶ TCID50. Dashed lines
represent detection limits. (C) Viral replication kinetics in ferret nasal wash after WSN virus challenge at
day 35 post-vaccination (n = 3). (D and E) Survival rate and body weight loss of HIS-vaccinated
mice after challenge with homologous and heterologous strains (n = 10). (F) Viral replication kinetics in
ferret nasal wash after A/California/07/09 virus challenge at 35 days post-vaccination (n = 3). Error
bars, SD. *P < 0.05, ***P < 0.001 [ANOVA with Bonferroni multiple comparisons test for (A) and (B),
two-tailed t test for (C) and (F), and log-rank test for (D)].

In general, this unbiased and quantitative high- 
throughput genomics system can be widely ap- 
plied to other pathogens to define the impact of 
genome-wide mutations under certain selection 
conditions. Similar profiling of a viral genome 
can be performed with other immune compo- 
nents, such as cytokines, natural killer cells, or 
T cells, in vitro and in vivo. Inactivating additional 
immune evasion functions in the virus will further 
increase the safety and immunogenicity of its 
derivatives for prevention or therapy. 
NANOROBOTICS


A self-assembled nanoscale robotic 
arm controlled by electric fields 


Enzo Kopperger, Jonathan List, Sushi Madhira, Florian Rothfischer,
Don C. Lamb, Friedrich C. Simmel


The use of dynamic, self-assembled DNA nanostructures in the context of nanorobotics 
requires fast and reliable actuation mechanisms. We therefore created a 55-nanometer-by-
55-nanometer DNA-based molecular platform with an integrated robotic arm of length
25 nanometers, which can be extended to more than 400 nanometers and actuated with
externally applied electrical fields. Precise, computer-controlled switching of the arm
between arbitrary positions on the platform can be achieved within milliseconds, as
demonstrated with single-pair Förster resonance energy transfer experiments and
fluorescence microscopy. The arm can be used for electrically driven transport of molecules 
or nanoparticles over tens of nanometers, which is useful for the control of photonic and 
plasmonic processes. Application of piconewton forces by the robot arm is demonstrated in 
force-induced DNA duplex melting experiments. 


Nanoscale robotic systems will enable the
programmable synthesis and assembly of
molecular materials from the bottom up.
Components of such systems were previously
created with the tools of supramolecular
chemistry (1-4) and bionanotechnology
(5). In particular, DNA self-assembly (6, 7) has
been used successfully to create nanoscale robotic
walkers (8-13), assembly lines (14), movable mo-
lecular arms (15-18), and molecular mechanisms
(19, 20). However, as a result of being driven by 
DNA hybridization reactions (8-10, 13-16, 18, 19), 
deoxyribozyme (11) or enzyme (12) action, changes 
in buffer composition, or using photoswitchable 
components (17), these systems were very slow, 
had a low assembly or operation yield, or were 
unable to exert appreciable forces against exter- 
nal loading. In one of the most successful meth- 
odologies (21, 22), DNA machines are driven 
through their operation cycle by hybridization 
with fuel and antifuel strands using toehold- 
mediated strand displacement reactions. Although 
this approach has the advantage of sequence 
addressability, DNA hybridization and strand- 
exchange reactions are slow, and structural switch- 
ing often occurs with low yield. In our experiment, 
we deliberately abandoned sequence-specific 
switching and used electrical fields to move the 
components of a DNA machine with respect to 
each other. We thus gain many orders of mag- 
nitude in operation speed, almost perfect switch-
ing yield, and the capability of computer-controlled
nanoscale motion and positioning. 


A DNA-based molecular platform with 
an integrated robotic arm 


The actuator unit of our system is composed of 
a 55-nm-by-55-nm DNA origami plate with an 
integrated 25-nm-long arm defined by a DNA 
six-helix bundle (6HB) (Fig. 1A), allowing for a 
high-yield, one-pot folding procedure. For the 
rigid DNA plate, we used a crossed two-layer 
scaffold routing in which the top layer is rotated 
by 90° with respect to the bottom layer (supple-
mentary materials and methods) (23). The 6HB,
functioning as the robot arm, is connected to the
top layer of the base plate via a flexible joint
created by two adjacent scaffold crossovers with
three and four unpaired bases, respectively (see
supplementary text section on the design of the
joint) (23). Successful assembly of the structure
with ~90% yield was verified using transmis-
sion electron microscopy (TEM) and atomic force
microscopy (AFM) (Fig. 1, B and C, and fig. S1)
(23). Consistent with our design, AFM indicates
a height of 4 nm for the base plate and an addi-
tional 4 nm for the 6HB arm.

Fig. 1. A molecular platform with an integrated rotatable positioning arm.
(A) Sketch of the DNA origami structure in side (top left) and top (right)
view. The close-up in the perspective view (bottom left) highlights the
single-stranded scaffold crossovers that form the flexible joint. (B) TEM
class-average (top) and single-particle (bottom) micrographs of the
structure. (C) AFM image of particles on mica. Only structures for which the
actuator arms are buried below the plates could be imaged with high contrast.
For imaging, the arms were fixed to the plates with a 10-bp duplex formed
between two staple extensions on the plate and the tip of the arm. (Inset)
Height profile of a platform measured along the direction indicated by the
red arrow.

We first used single-molecule multicolor Förster
resonance energy transfer (FRET) experiments
to investigate diffusive motion of the arm with
respect to the base plate (Fig. 2). For these ex-
periments, we extended two staple strands on
opposite sides of the plate with an identical short
docking sequence, whereas a staple strand on
the arm was extended with the complementary
sequence. Transient binding of the arm results
in stochastic switching between the two docking
sites, which we observed with the help of three
reporter dyes: a FRET donor at the tip of the arm
and two different acceptor dyes at the docking
sites (Fig. 2A). A typical trace of stochastically
alternating FRET signals is shown in Fig. 2B.
Upon donor excitation, a high donor fluorescence
(blue) indicates a freely diffusing arm, whereas a
high acceptor fluorescence (green or red) indi-
cates docking at the respective site. Dwell times
for the three states were extracted from fluores-
cence traces of more than 1000 robot-arm plat-
forms via a hidden Markov model analysis (24)
(fig. S2 and supplementary methods) (23). As
expected, the dwell time in the bound states in-
creases with docking duplex length (Fig. 2C, top).
The dwell time spent in the unbound state also
increases (Fig. 2C, bottom), indicating slower
diffusion and/or a reduced hybridization rate for
Fig. 2. Stochastic switching experiments. (A) For single-molecule
multicolor FRET experiments, a donor fluorophore (Alexa Fluor 488) is attached to
the six-helix bundle (6HB) arm and two acceptor fluorophores (ATTO 565 and
ATTO 647N) are connected to staple-strand extensions on opposite sides of the
plate. The pictograms on the left show hybridization of an extended staple of
the arm to the staple extension of the base plate labeled with ATTO 647N. The
length of the docking duplex was varied between 8 and 10 bp. A schematic
three-dimensional representation is shown on the right. (B) Fluorescence traces
obtained from the three fluorophores during donor excitation of the structures
containing 9-bp docking duplexes. The change between green and red
fluorescence indicates switching of the arm between corresponding docking sites.
The zoomed-in view (bottom) reveals short periods of free diffusion between
unbinding and rebinding events during which the donor (blue) fluorescence is
dominant. a.u., arbitrary units. (C) Average dwell times for the bound and unbound
states and their dependence on duplex length. Dwell times for the bound states
(high acceptor signals shown in red or green; top panel) correspond to the times
spent at the respective docking site. Dwell times for the unbound state (high
donor signal shown in blue; bottom panel) represent the length of the traversal
periods of the freely diffusing arm. (D) Average durations of the unbound states
for various transitions and their dependence on duplex length. Corresponding to
the start and end points of the traversal period (docking site or bound state
shown in green or red before and after the unbound state), the unbound states
can be classified as green→red and green→green or red→green and red→red
traversals. In (C) and (D), the error bars denote the SD of the mean from three
independent measurements.
Fig. 3. External electric control of the robotic arm. (A) Two pointer
extension designs for the robot arm and corresponding TEM images. The 
blue, linear extension pointer has a length of 411 nm (total length from 
center of rotation to tip: 436 nm). The orange pointer has a shape- 
complementary connection that withstands higher torque (total length: 
354 nm; pivot point to tip: 332 nm; resulting arm extension: 308 nm). 
(B) Cross section and (C) top and isometric view of the cross-shaped 
electrophoretic sample chamber. PMMA, poly(methyl methacrylate); 

U, voltage. (D) Schematic depiction of the experimental setup with four 
electrodes. (E) Fluorescence microscopy images of three structures that 
are switched in the electric field. For the highlighted particle, movements are 
shown as snapshots and kymographs. The green and blue arrows indicate 
the axes chosen for the kymographs. (Top) Switching left and right with
1 Hz. (Bottom) Switching up and down with 1 Hz. (F) (Top) One clockwise
turn of 1-Hz rotation. (Bottom) Kymographs showing multiple turns of 
clockwise rotation followed by multiple counterclockwise turns, separately 
for the x and y axes and as a blue and green overlay. Reversal of the voltage 
and, thus, of the rotation direction is indicated by the red arrowhead (movie 
S1) (23). (G) Kymographs (x and y projections) obtained from a frequency 
sweep from 0 to 8 Hz and back, shown as an overlay of the kymographs
along the x and y axes. (H) High-speed 360° clockwise and counter- 
clockwise rotation with 25 Hz. For each frame, the center of the pointer 
tip is indicated by a red cross. Reversal of the rotation direction is indicated 
by red arrowheads (movie S2) (23). Unlabeled scale bars, 1 μm.

Fig. 4. Controlled hybridization and force-induced duplex dissociation.
(A) Field-controlled switching of the extended robot arm between two
9-bp docking positions. (Left) Scheme of the setup. (Right) Single-molecule
localization image of pointer positions acquired during electrical rotation at 
1 Hz. The number of localizations is increased at angles corresponding to 
the two docking positions. (B) Angle plotted over time for 1-, 2-, and 4-Hz 
rotation with 110 V. The arm shows pronounced lagging for two angles 
(highlighted by gray bands). Higher frequencies result in a larger number of 
missed turns, which are indicated by the red arrowheads. (C) Unzipping of a 


longer docking duplexes. Observed state transi- 
tions can be classified into transitions from one 
binding site to the other (green→red or red→
green) or rebinding events to the same docking
site (green→green or red→red). When the arm
initially unbinds from the green docking site, 
it binds to either site with roughly the same 
transition time (Fig. 2D, top). Conversely, arms 
starting at the red docking site have a higher 
tendency to return to the same site (Fig. 2D, bot- 
tom). This bias is consistent with the expected 
orientation of the arm on the base plate, which is 
designed to point toward the red docking site (see 
Fig. 1A). The corresponding higher effective con- 
centration of the arm in the vicinity of the red 
docking site results in faster rebinding transi- 
tions (16). Photophysical origins of the observed 
changes in the FRET signal (such as fluorescent 
dark states or environmental quenching of the 
fluorophores) were excluded by performing milli- 
second alternating laser excitation (25) experi- 
ments (fig. S3 and supplementary materials and 
methods) (23). 
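The dwell-time analysis described above can be sketched as follows, assuming the hidden Markov model step has already assigned one state label to every frame. The labels ("G", "R", and "U" for the two docking sites and the unbound arm), the frame time, and the choice to discard the first and last runs are illustrative assumptions, not details taken from the supplementary methods.

```python
from itertools import groupby


def dwell_times(states, frame_time_ms, drop_edges=True):
    """Group consecutive identical state labels into dwell times.

    Returns a dict mapping each state label to a list of dwell times in
    milliseconds. With drop_edges=True the first and last runs are removed
    because the observation window truncates them.
    """
    runs = [(state, sum(1 for _ in group)) for state, group in groupby(states)]
    if drop_edges:
        runs = runs[1:-1]
    result = {}
    for state, n_frames in runs:
        result.setdefault(state, []).append(n_frames * frame_time_ms)
    return result


def mean_dwell(dwells):
    """Return the mean dwell time per state."""
    return {state: sum(times) / len(times) for state, times in dwells.items()}


# Hypothetical per-frame state sequence, for illustration only.
trace = list("GGGUURRRRUG")
dwells = dwell_times(trace, frame_time_ms=100)
```

Classifying each unbound run by the bound states before and after it (for example green→red versus green→green) would follow the same pattern, by inspecting the neighbors of every "U" run.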


Modular extension with pointer structures 


To facilitate direct observation of the arm’s motion 
by diffraction-limited fluorescence microscopy, 
we designed two versions of pointer structures
that were multiply labeled with the fluorophore 
ATTO 655. Version one extended the arm lin- 
early by 411 nm (Fig. 3A, blue). Version two ex- 
tended the arm by 308 nm (Fig. 3A, orange) 
and was modularly plugged into the robot arm 
via a shape-complementary connector structure, 
creating a more stable connection between pointer 
and arm to allow for better torque transmission. 
Both pointers are based on a rigid 6HB with a 
persistence length >1 μm (26). The two designs
were motivated by the differing requirements 
for the experiments described below. For rota- 
tional diffusion experiments in the absence of 
docking sites, we found that the linear pointer 
interacted less with the base plate than the shape- 
complementary pointer (fig. S4) (23). However, 
when used to exert forces, the linear pointer dis- 
played a reduced stability and tended to break at 
the connection site (supplementary text) (23). In 
the presence of docking sites, single-molecule 
localization images of both pointers were con- 
sistent with the positions of the docks on the 
platform, proving that the extensions point along 
the axis of the short arm (fig. S5) (23) and that 
the interactions with the docking sites domi- 
nated over unspecific sticking. 

20-bp DNA duplex with the extended robot arm. (Left) Extensions from the 
platform and from the arm feature a short 8-bp strain-relief domain that 
prevents the staple strands from being pulled out of the structure. (Right) 
Experiments with two example particles are shown. Without an electric 
field, the arm is fixed at one of two docking positions on the base plate. 
(D) Rotation requires unzipping of the duplex, which is shown in the images 
(red, before rotation; violet, during rotation; and blue, after rotation) and 
kymographs at the bottom. Particle #1 rebinds to the starting position, 
whereas particle #2 rebinds to the position on the opposite side. 


Electrical control of the robot arm 
To realize dynamic external control of the robot 
arm, we applied electrical fields to the system— 
a natural choice for the manipulation of charged 
biomolecules (27). Electrical fields have been 
previously used only to stretch or orient substrate- 
immobilized DNA duplexes (28) but not to dynam- 
ically control the conformation of nanomechanical 
DNA devices. We created a cross-shaped electro- 
phoretic chamber constituted by two perpendic- 
ular fluidic channels intersecting at the center of 
a microscopy cover slip, with two pairs of plati- 
num electrodes inserted into the four buffer re- 
servoirs (Fig. 3, B and C, and fig. S6) (23). DNA 
nanostructures immobilized at the center of the 
cross chamber experience a superposition of the 
fields generated by the electrode pairs. Hence, a 
voltage can be applied to arbitrarily control the 
pointing direction of the arm (Fig. 3D). 
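
The superposition of the two electrode-pair fields can be sketched in a few lines. The sketch assumes that the two channels produce perpendicular fields of equal strength per volt, so that a cosine and sine drive sets the field angle; the function names, the sampling scheme, and the sign convention are illustrative assumptions rather than the control software used in the experiments.

```python
import math


def electrode_voltages(angle_rad, amplitude_v):
    """Return (U_x, U_y) for a desired in-plane field angle.

    Assumes ideal perpendicular channels with identical geometry. The arm
    carries negative charge, so its actual pointing direction relative to
    the field, and any calibration offset, are not modeled here.
    """
    return amplitude_v * math.cos(angle_rad), amplitude_v * math.sin(angle_rad)


def rotation_waveform(frequency_hz, amplitude_v, duration_s, sample_rate_hz):
    """Sample the two voltages for a field rotating at a constant frequency."""
    n_samples = int(round(duration_s * sample_rate_hz))
    return [
        electrode_voltages(2.0 * math.pi * frequency_hz * i / sample_rate_hz,
                           amplitude_v)
        for i in range(n_samples)
    ]


# One second of a 1-Hz rotation at an illustrative amplitude of 120 V.
waveform = rotation_waveform(1.0, 120.0, 1.0, 100.0)
```

Reversing the rotation direction corresponds to negating the angle, and a frequency sweep corresponds to accumulating the phase sample by sample instead of using a fixed frequency.
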
Electrical actuation of the arms results in a 
movement of the pointers, which we observed 
with an electron-multiplying charge-coupled de- 
vice camera using total internal reflection fluo- 
rescence (TIRF) microscopy. In Fig. 3, we show 
switching of an arm in two perpendicular direc- 
tions (Fig. 3E), as well as rotation with a constant 
frequency of 1 Hz (Fig. 3F) and with variable 
frequency (Fig. 3G), ramping from 0 to 8 Hz
and back to 0 Hz. Movie S1 (23) shows a range
of movement patterns, underlining the capabil-
ity of arbitrary angular control. To characterize
faster arm movements, we used a complemen-
tary metal-oxide semiconductor camera to record
TIRF microscopy videos with a 2-ms time reso-
lution. An image series taken from a video in
which the robot arm was rotated back and forth
at a frequency f = 25 Hz is shown in Fig. 3H (see
also supplementary movie S2) (23). Kymographs
displaying the projected motion of the arm’s
pointer along the x and y axes show the expected
sinusoidal characteristics. In a high-viscosity buf-
fer solution containing 65% sucrose, motion of
the arm was substantially slowed (fig. S7) (23).
Next, we assessed the angular positioning pre-
cision of the arm, which can be achieved in the
absence of docking sites by the electrical field
alone (fig. S8) (23). For large applied voltages
(≥120 V in our setup), the angular standard de-
viation is ~0.1 radians, which translates to a po-
sitioning precision of ~2.5 nm on the plate.

Fig. 5. Electrically controlled movement of molecules and nanoparticles.
(A) Configuration of the robot arm with shape-complementary extension for
transport of the FRET donor Alexa Fluor 488 between two 9-nucleotide
docking sites with the acceptors ATTO 565 and ATTO 647N. (B) Acceptor
signals for continuous donor excitation for electrical rotation at 1 Hz (top),
2 Hz (middle), and 4 Hz (bottom). (C) For application of the robot arm in
switchable plasmonics, a 25-nm-long gold nanorod (AuNR) is attached to the
side of the 6HB arm, and 11 ATTO 565 and ATTO 655 dyes are placed on
opposite halves of the platform. (D) (Top) TEM micrograph of a structure with
a 25-nm AuNR. (Bottom) Fluorescence traces for continuous excitation of
the dyes while the robot arm is rotated at 1, 2, and 4 Hz (fig. S12) (23) for
data obtained with a 50-nm AuNR.


Controlled interaction with docking
positions on the platform


To investigate the interaction of the arm with
binding sites on the platform during electrical
manipulation, we performed latching experiments
with the same arrangement of docks as in Fig. 2
and an identical 9-base pair (bp) docking se-
quence (Fig. 4A). When rotated at frequencies of
f = 1, 2, and 4 Hz, we observed temporary stalling
of the pointer at the two angle positions that
correspond to the two docking sites (Fig. 4B),
indicating that the arm snaps into the bind-
ing positions during rotation. Whereas the sig-
nal followed the external control faithfully for
f = 1 Hz, occasional skips occurred for f = 2 and
4 Hz. This behavior is caused by the statistical
nature of single-molecule duplex dissociation,
whose frequency increases exponentially with the
application of a force (29) and, in dynamic exper-
iments, also depends on the force rate (29).
Apparently, the dissociation rate (~0.4 s⁻¹) (Fig.
2B) of the docking duplex is sufficiently enhanced
by the electrical force to follow the 1-Hz rotation.
For higher frequencies, the duplex does not al-
ways dissociate fast enough and the arm cannot
follow the rotation of the electrical field. By con-
trast, at a slower rotation speed of f = 0.1 Hz, we
were able to observe dynamic latching to four
different docking sites (fig. S9) (23).

We next tested whether the robotic arm could
wrest apart a 20-bp docking duplex, which is a
stable structure at room temperature. Although
the arm is firmly locked in place in the absence of
an electrical field, it can be released from the
docking site by actuating the arm and rotated 
as shown in Fig. 4, C and D. Unzipping is ex- 
pected to be most effective when the field is ap- 
plied perpendicularly to the fixed arm. As the base 
plates are randomly oriented with respect to the 
sample chamber, the field is slowly rotated at a 
frequency of 0.2 Hz to guarantee that each struc- 
ture has sufficient time to experience a strong 
enough unzipping force. When switching off the 
field during rotation at an arbitrary phase, the 
arm immediately localizes to an available dock- 
ing site. 

At the field strengths generated in our sample 
chamber, we do not expect field-induced melting 
of DNA duplexes as is observed, for instance, for 
DNA structures immobilized on electrode surfaces 
(30). Instead, the arm acts as a lever that mechan- 
ically transduces the electrical force acting on its 
large charge to the docking duplex. Force-induced 
unzipping of DNA duplexes has been previously 
achieved through the use of single-molecule manip-
ulation techniques such as AFM (31) and optical
tweezers (32) or within nanopores (33). These 
experiments have shown that DNA unzipping 
requires forces on the order of 10 to 20 pN, con- 
sistent with the typical binding free energy of 
DNA base pairs and their subnanometer spac- 
ing. A rough theoretical treatment (supplemen- 
tary text) (23) suggests that forces that can be 
generated by the robot arm are on this scale.
The ability to separate stable duplexes by force 
facilitates the electrically controlled dissociation 
of the arm from one docking site and its subse- 
quent placement at a different target position, 
which is then maintained at zero field (figs. S10 
and S11) (23). 
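
The force scale quoted above and the exponential force dependence of duplex dissociation can be illustrated with a rough calculation. The sketch assumes a free energy of about 2.5 k_BT per base pair, an extension gain of about 0.8 nm per opened base pair, and a Bell-type rate law with a 1-nm distance to the transition state. These three numbers are illustrative assumptions; only the zero-force rate of ~0.4 s⁻¹ is taken from the text.

```python
import math

KBT_PN_NM = 4.11  # thermal energy near room temperature, in pN*nm


def unzipping_force_pn(dg_per_bp_kbt, extension_per_bp_nm):
    """Estimate the unzipping force as free energy per unit extension."""
    return dg_per_bp_kbt * KBT_PN_NM / extension_per_bp_nm


def bell_rate(k0_per_s, force_pn, dx_nm):
    """Bell-type estimate: k(F) = k0 * exp(F * dx / kBT)."""
    return k0_per_s * math.exp(force_pn * dx_nm / KBT_PN_NM)


# Assumed inputs (see lead-in); the result is an order-of-magnitude estimate.
force_estimate = unzipping_force_pn(dg_per_bp_kbt=2.5, extension_per_bp_nm=0.8)
rate_at_5_pn = bell_rate(k0_per_s=0.4, force_pn=5.0, dx_nm=1.0)
```

With these assumptions the force estimate comes out near 13 pN, inside the 10 to 20 pN range cited, and a few piconewtons already raise the dissociation rate of the 9-bp docking duplex above 1 s⁻¹, consistent with the arm following a 1-Hz rotation.
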


Electrically controlled movement of cargo 


To show controlled movement of a cargo mole- 
cule attached to the arm, we used the three-color 
FRET system already employed in the stochastic 
switching experiments (Fig. 5A). In contrast to 
those experiments, the donor fluorophore is ac- 
tively transported between two 9-nucleotide- 
long docking positions by rotating the arm with 
the help of the high torque extension at rotation 
frequencies of f = 1, 2, and 4 Hz, respectively. We
observed alternating FRET traces (Fig. 5B) with 
the periodicity of the externally applied field. In 
agreement with the latching experiments (Fig. 
4B), higher rotation frequencies correspond with 
an increase in the number of skips. 

To demonstrate transport of inorganic nano- 
particles by the robot arm, we attached a gold 
nanorod (AuNR) to one side of the 6HB arm and 
probed its plasmonic interaction with red and 
green fluorophores immobilized on the platform 
(Fig. 5C). As shown in Fig. 5D and fig. S12 (23), 
the AuNR alternatingly modulates the fluores- 
cence of the fluorophores during rotation of the 
arm at the externally prescribed frequency. Elec- 
trical manipulation enables faster operation of 
switchable biohybrid plasmonic systems than pre- 
viously achieved with the fuel-strand technique 
(34). More sophisticated systems involving mul- 
tiple particles for the creation of switchable field 
enhancement or circular dichroism appear feasi- 
ble (35). 


Discussion 


We have introduced electrical actuation as a 
viable strategy for fast, computer-controlled oper- 
ation of biohybrid nanorobotic systems, which 
can exert forces at the molecular scale. Compared 
with nanoscale manipulation methods such as 
scanning probe techniques and optical or mag- 
netic tweezers, electrical control is contact-free 
and can be implemented with low-cost instrumen- 
tation. The robotic movements achieved are at 
least five orders of magnitude faster than pre- 
viously reported for the fastest DNA motor systems 
and are comparable to adenosine triphosphatase- 
driven biohybrids (5). The robot-arm system may 
be scaled up and integrated into larger hybrid 
systems by a combination of lithographic and 
self-assembly techniques. For instance, the plat- 
forms can be easily connected to form long fila- 
ments with multiple DNA robot arms (fig. S13) 
(23) or to create extended lattices. The use of 
algorithmic self-assembly (36) will enable the 
creation of structures with different types of 
robot platforms with dedicated tasks. Lithographic 
patterning of the substrate (37-39) will further 
allow the fabrication of robot-arm arrays with 
defined platform orientations. By using nano- 
structured control electrodes, single robot arms 
could even be addressed individually, and their 
positioning state could act as a molecular mechan- 
ical memory. Combined with appropriate pick- 
up and release mechanisms (3, 40), it is conceivable 
that this technology can also be applied to DNA- 
templated synthesis (41). Electrically clocked syn- 
thesis of molecules with a large number of robot 
arms in parallel could then be the first step toward 
the realization of a genuine nanorobotic produc- 
tion factory. 
Quantum droplets are small clusters of atoms self-bound by the balance of attractive and repulsive 
forces. Here we report on the observation of droplets solely stabilized by contact interactions in a mixture 
of two Bose-Einstein condensates. We demonstrate that they are several orders of magnitude more dilute 
than liquid helium by directly measuring their size and density via in situ imaging. We show that the 
droplets are stabilized against collapse by quantum fluctuations and that they require a minimum atom 
number to be stable. Below that number, quantum pressure drives a liquid-to-gas transition that we map 
out as a function of interaction strength. These ultra-dilute isotropic liquids remain weakly interacting and 
constitute an ideal platform to benchmark quantum many-body theories. 


Quantum fluids can be liquids—of fixed volume—or gases, 
depending on the attractive or repulsive character of the in- 
ter-particle interactions and their interplay with quantum 
pressure. Liquid helium is the prime example of quantum flu- 
ids. For small particle numbers it forms self-bound liquid 
droplets: nanometer-sized, dense and strongly interacting 
clusters of helium atoms. Understanding the droplets’ prop- 
erties, which directly reflect their quantum nature, is chal- 
lenging and requires a precise knowledge of the short-range 
details of the interatomic potential (1, 2). Very different quantum
droplets, more than 2 orders of magnitude larger and 8 
orders of magnitude more dilute, have recently been pro- 
posed in ultracold atomic gases (3, 4). Interestingly, these ul- 
tra-dilute systems enable a much simpler microscopic 
description, while remaining in the weakly interacting re- 
gime. They are thus amenable to well controlled theoretical 
studies. 

The formation of quantum droplets requires a balance be- 
tween attractive forces, which hold them together, and repul- 
sive ones that stabilize them against collapse. In helium 
droplets, the repulsion is dominated by the short-range part 
of the interatomic potential (1, 2). In contrast, for ultracold 
atomic droplets several distinct stabilization mechanisms 
have been proposed, including three-body correlations (3) 
and quantum fluctuations (4). The latter can be revealed in 
systems with competing interactions, where mean-field 
forces of different origins almost completely cancel out and 
result in a small residual attraction. In such systems, beyond 
mean-field effects remain important even in the weakly in- 
teracting regime. To first order they lead to the Lee-Huang- 
Yang (LHY) repulsive energy (5), comparable in strength to 
the residual mean-field (MF) attraction. Recently, ultracold 
atomic droplets stabilized by quantum fluctuations have been 
realized in single-component magnetic quantum gases with 
competing attractive dipolar and repulsive contact interac- 
tions (6-11). In this case, the anisotropic character of the mag- 
netic dipole-dipole force leads to the formation of filament- 
like self-bound droplets with highly anisotropic properties (9, 
12, 13). Given the generality of the stabilization mechanism, 
droplets should in fact also exist in simpler systems with pure 
isotropic contact interactions. Even though they were origi- 
nally predicted in this setting (4), their experimental obser- 
vation has so far remained elusive. 

Here, we observe ultracold atomic droplets in a mixture 
of two Bose-Einstein condensates with competing contact interactions.
Although a single-component attractive condensate
with only contact interactions collapses (14, 15), 
quantum fluctuations stabilize a two-component mixture 
with inter-component attraction and intra-component repul- 
sion (4). There, the repulsion in each component remains 
large and results in a non-negligible Lee-Huang-Yang energy. 
The beyond mean-field repulsion and the residual mean-field 
attraction scale differently with the total density $n$ (in three
dimensions, the Lee-Huang-Yang and mean-field energy densities
scale as $E_{\mathrm{LHY}} \propto n^{5/2}$ and $E_{\mathrm{MF}} \propto n^{2}$, respectively). 
Hence, there is always a density for which these contributions 
balance each other and droplets are stabilized. Unlike their 
dipolar counterparts, these mixture droplets originate exclu- 
sively from s-wave contact interactions and are therefore iso- 
tropic. We demonstrate the self-bound character of mixture 
droplets and directly measure their ultra-low densities and 
micrometer-scaled sizes. Moreover, by comparison to a sin- 
gle-component condensate with only contact interactions, we 
confirm that their stability stems from quantum fluctuations. 
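The balance between the two scalings can be made concrete with a toy energy density, E(n) = -αn² + βn^{5/2}, where the attractive n² term stands for the residual mean-field energy and the repulsive n^{5/2} term for the Lee-Huang-Yang energy. The sketch below is a minimal illustration in arbitrary units: `alpha` and `beta` are hypothetical positive coefficients, not the prefactors of the actual theory. A self-bound droplet in free space sits at zero pressure, P = n dE/dn - E = 0, which gives n_eq = (2α/3β)².

```python
# Toy model of a self-bound droplet: attractive n^2 term vs. repulsive n^(5/2) term.
# alpha and beta are hypothetical positive coefficients in arbitrary units.


def energy_density(n: float, alpha: float, beta: float) -> float:
    """E(n) = -alpha * n^2 + beta * n^(5/2)."""
    return -alpha * n**2 + beta * n**2.5


def pressure(n: float, alpha: float, beta: float) -> float:
    """P = n * dE/dn - E = -alpha * n^2 + 1.5 * beta * n^(5/2)."""
    return -alpha * n**2 + 1.5 * beta * n**2.5


def equilibrium_density(alpha: float, beta: float) -> float:
    """Density at which the pressure vanishes: n_eq = (2 alpha / (3 beta))^2."""
    return (2.0 * alpha / (3.0 * beta)) ** 2


if __name__ == "__main__":
    alpha, beta = 1.0, 2.0
    n_eq = equilibrium_density(alpha, beta)
    print("n_eq =", n_eq)
    print("P(n_eq) =", pressure(n_eq, alpha, beta))
    print("E(n_eq) =", energy_density(n_eq, alpha, beta))
```

Because the repulsive term grows faster with density than the attractive one, a zero-pressure density exists for any positive `alpha` and `beta`, and the energy density there is negative, which is the sense in which the droplet is bound in this toy picture.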
We perform experiments with two $^{39}$K Bose-Einstein condensates
in states $|\uparrow\rangle = |F, m_F\rangle = |1,-1\rangle$ and $|\downarrow\rangle = |1,0\rangle$, where
$F$ is the total angular momentum and $m_F$ its projection. An
external magnetic field allows us to control the interactions,
which are parameterized by the intra- and inter-state scattering
lengths $a_{\uparrow\uparrow}$, $a_{\downarrow\downarrow}$, and $a_{\uparrow\downarrow}$ (see Fig. 1A). These have been
computed according to the model of ref. (16). The residual
mean-field interaction is proportional to
$\delta a = a_{\uparrow\downarrow} + \sqrt{a_{\uparrow\uparrow} a_{\downarrow\downarrow}}$.
The condition $\delta a = 0$ separates the repulsive ($\delta a > 0$) and
attractive ($\delta a < 0$) regimes. The experiment starts with a pure
condensate in state $|\uparrow\rangle$ loaded in one plane of a vertical
blue-detuned lattice potential (Fig. 1B). We choose a trapping
frequency $\omega_z/2\pi = 635(5)$ Hz large enough to compensate for
gravity, but small enough for the system to be in the
three-dimensional regime. Indeed, the vertical harmonic oscillator
length $a_{\mathrm{ho}} = 0.639(3)$ $\mu$m exceeds the characteristic length of
the most energetic Bogoliubov excitation branch by typically
a factor of 3 (17). A vertical red-detuned optical dipole trap
provides radial confinement in the horizontal plane. In order
to prepare a balanced mixture of the two states, we apply a
radio-frequency pulse at $B \approx 57.3$ G, which lies in the miscible
regime ($\delta a \approx 7 a_0$, where $a_0$ denotes the Bohr radius) (18). For
all measurements, we verify independently the spin composition
of the mixture via Stern-Gerlach separation during
time-of-flight expansion (Fig. 1B). Subsequently, we slowly ramp
down the magnetic field at a constant rate of 59 G/s and enter
the attractive regime $\delta a < 0$ (17). We then switch off the
vertical red-detuned optical dipole trap while keeping the lattice
confinement, allowing the atoms to evolve freely in the
horizontal plane. The integrated atomic density is imaged in
situ at different evolution times. We use a high numerical aperture
objective (<1 $\mu$m resolution, 1/e Gaussian width) along
the vertical direction and a phase-contrast polarization
scheme (19) which detects both states with almost equal
sensitivity (Fig. 1) (17). 
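The quoted oscillator length follows from the trapping frequency through the standard relation a_ho = sqrt(ħ/(m ω_z)). The sketch below checks that number, assuming the atoms are potassium-39 (the isotope mass is an assumption supplied here) and using a hypothetical helper name, `oscillator_length_um`.

```python
import math

HBAR = 1.054571817e-34      # reduced Planck constant, J*s
AMU = 1.66053906660e-27     # atomic mass unit, kg
MASS_K39_U = 38.96370649    # assumed isotope: mass of potassium-39 in u


def oscillator_length_um(mass_u: float, trap_freq_hz: float) -> float:
    """Harmonic oscillator length a_ho = sqrt(hbar / (m * omega)) in micrometers."""
    omega = 2.0 * math.pi * trap_freq_hz  # angular frequency, rad/s
    return math.sqrt(HBAR / (mass_u * AMU * omega)) * 1e6


if __name__ == "__main__":
    # Vertical trapping frequency quoted in the text: 635 Hz.
    print(f"a_ho = {oscillator_length_um(MASS_K39_U, 635.0):.3f} um")
```

Under this assumption the result agrees with the 0.639(3) μm stated in the text to within its quoted uncertainty.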

Typical images of the mixture time evolution in the repulsive
and attractive regimes are displayed in Fig. 1C. For
$\delta a = 1.2(1)\,a_0 > 0$ (top row), the cloud expands progressively in the
plane, as expected for a repulsive Bose gas in the absence of
radial confinement (20). In contrast, in the attractive regime
$\delta a = -3.2(1)\,a_0 < 0$ (central row), the dynamics of the system
are remarkably different and the atoms reorganize into an isotropic
self-bound liquid droplet. Its typical size remains constant
for evolution times up to 25 ms. In an analogous
experiment with a single-component attractive condensate
$|\downarrow\rangle$ of scattering length $a = -2.06(2)\,a_0 < 0$, the system
instead collapses (bottom row). In our experimental geometry,
quantum pressure can never stabilize bright solitons because
of the presence of weak anti-confinement in the horizontal 
plane (17). At the mean-field level, the two-component attractive
case has a description equivalent to the single-component
one, provided that the scattering length $a$ is replaced by
$\sim\delta a/2$ and the density ratio between the two components is
fixed at $n_\uparrow/n_\downarrow = \sqrt{a_{\downarrow\downarrow}/a_{\uparrow\uparrow}}$ (17). However, the role of the first
beyond mean-field correction is very different in the two systems,
explaining their very different behavior. In the
single-component case, the Lee-Huang-Yang energy depends on $a$
and in the weakly interacting regime constitutes a negligible
correction to the mean-field term. Therefore, its contribution
is most easily revealed in strongly interacting systems (21). In
contrast, in the mixture the mean-field and Lee-Huang-Yang
energy densities scale as $E_{\mathrm{MF}} \propto \delta a\, n^{2}$ and
$E_{\mathrm{LHY}} \propto (\sqrt{a_{\uparrow\uparrow} a_{\downarrow\downarrow}}\, n)^{5/2}$, respectively. Because
$\sqrt{a_{\uparrow\uparrow} a_{\downarrow\downarrow}} \gg |\delta a|$, for typical experimental
parameters they balance at accessible atomic densities
and stabilize liquid droplets (4). Therefore, the existence
of liquid droplets is a striking manifestation of beyond
mean-field effects in the weakly interacting regime.

To further characterize the mixture, we perform a quantitative
analysis of the images, fitting the integrated atomic
density profiles with a two-dimensional Gaussian (17). We
extract the atom number $N$ and radial size $\sigma_r$, and infer the peak
density $n_0 = N/(\pi^{3/2} \sigma_r^{2} \sigma_z)$ by assuming a vertical size $\sigma_z$
identical to the harmonic oscillator length $a_{\mathrm{ho}}$. Figure 2A (top
and central panel) shows the time evolution of $N$ and $\sigma_r$ measured
for the interaction parameters of Fig. 1C. For $\delta a > 0$ (red
circles) the gas quickly expands and its atom number does
not vary. Instead, for $\delta a < 0$ (blue circles) the system is in
the liquid regime and the radial size of the droplet remains
constant at $\sigma_r \approx 6$ $\mu$m. Initially its atom number is
$N = 24.5(7) \times 10^{3}$, corresponding to a peak density of
$n_0 = 1.97(8) \times 10^{14}$ atoms/cm$^3$. We attribute the subsequent decay
of the droplet atom number (Fig. 2A, top panel) to three-body
recombination. The observed timescale is compatible with the measured
density and effective three-body loss rate (17). By directly
measuring the density of our droplets we confirm that they
are more than 8 orders of magnitude more dilute than liquid
helium and remain very weakly interacting. Indeed, the
interaction parameters of each component are extremely small
($n_\uparrow a_{\uparrow\uparrow}^{3},\ n_\downarrow a_{\downarrow\downarrow}^{3} \sim 10^{-5}$). 
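The peak-density estimate described in this paragraph can be reproduced from the fitted quantities. For a Gaussian cloud with 1/e radii σ_r (radial) and σ_z (vertical), the peak density is n_0 = N/(π^{3/2} σ_r² σ_z). The sketch below uses rounded inputs read from the text: σ_r ≈ 6 μm, σ_z equal to the 0.639 μm oscillator length, and an atom number of about 2.45 × 10^4 (the exponent is an assumed reading of the extracted text). The function name is hypothetical.

```python
import math


def peak_density_cm3(atom_number: float, sigma_r_um: float, sigma_z_um: float) -> float:
    """Peak density of a 3D Gaussian cloud with 1/e radii sigma_r and sigma_z.

    n0 = N / (pi^(3/2) * sigma_r^2 * sigma_z), returned in atoms per cm^3.
    """
    volume_um3 = math.pi**1.5 * sigma_r_um**2 * sigma_z_um
    return atom_number / volume_um3 * 1e12  # 1 cm^3 = 1e12 um^3


if __name__ == "__main__":
    # Rounded inputs; the atom number is an assumed reading (about 2.45e4).
    n0 = peak_density_cm3(24.5e3, 6.0, 0.639)
    print(f"n0 = {n0:.2e} atoms/cm^3")
```

With these rounded inputs the estimate lands near 2 × 10^14 atoms/cm³; small differences from a reported value would be expected because σ_r is only quoted approximately.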

A closer view of the droplet size is displayed in the bottom
panel of Fig. 2A. At $t \sim 25$ ms, $\sigma_r$ starts to increase and the
system behaves like the $\delta a > 0$ gas. Following refs. (4, 9, 22,
23), we attribute the dissociation of the droplet to the effect
of quantum pressure, which acts as a repulsive force. As the
atom number decreases, the relative weight between kinetic
($E_K$) and interaction energies ($E_{\mathrm{MF}}$, $E_{\mathrm{LHY}}$) changes, because each
term scales differently with $N$: $E_K \propto N$, $E_{\mathrm{MF}} \propto N^{2}$, and
$E_{\mathrm{LHY}} \propto N^{5/2}$. Below a critical atom number, kinetic effects become
sufficiently strong to drive a liquid-to-gas transition. To support
this scenario, Fig. 2B depicts the radial size and atomic density
as a function of atom number. For $\delta a < 0$ (blue circles)
we observe that both size (top panel) and density (bottom
panel) remain constant at large $N$. For decreasing atom number,
we observe a point where the size diverges and the density
drops abruptly. This indicates a liquid-to-gas transition,
which takes place at the critical atom number $N_c$. Below this
value, the attractive gas is still stabilized by quantum fluctuations
but expands because of kinetic effects, similarly to the
repulsive mixture ($\delta a > 0$, red circles). 
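A heuristic way to see how the three energy contributions acquire different powers of the atom number, offered as a sketch rather than the authors' derivation: assume a cloud of fixed size $\sigma$ with volume $V \propto \sigma^3$ and density $n \sim N/V$, and combine this with the density scalings of the mean-field and Lee-Huang-Yang energy densities.

```latex
\begin{align}
E_K &\sim N\,\frac{\hbar^2}{m\sigma^2} \;\propto\; N, \\
E_{\mathrm{MF}} &\propto \delta a\, n^{2}\, V \;\propto\; \frac{N^{2}}{V}, \\
E_{\mathrm{LHY}} &\propto n^{5/2}\, V \;\propto\; \frac{N^{5/2}}{V^{3/2}}.
\end{align}
```

At fixed size, lowering $N$ therefore shrinks both interaction terms faster than the kinetic term, so below some atom number the kinetic contribution dominates and the cloud can no longer remain self-bound.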

The liquid-to-gas transition is expected to depend on $\delta a$,
as sketched in the inset of Fig. 2B (top panel). We explore the
phase diagram by tuning the interaction strengths with magnetic
field (see Fig. 1A). Figure 3A displays the measured size
as a function of the atom number for magnetic fields corresponding
to $\delta a$ between $-5.5(1)\,a_0$ and $-2.4(1)\,a_0$. The critical
number $N_c$ shows a strong dependence on the magnetic field.
The top panel of Fig. 3B presents our experimental determination
of the phase transition line. $N_c$ increases when the attraction
decreases, confirming that weakly bound droplets
are more susceptible to kinetic effects and require a larger
atom number to remain self-bound. Figure 3A also yields the
droplet size as a function of atom number and magnetic field.
In the bottom panel of Fig. 3B we display the measurements
obtained at a fixed atom number $N = 1.5(1) \times 10^{4}$, always
larger than $N_c$ for our interaction regime. As expected, the
droplet size decreases as the attraction increases. 

We theoretically describe the system using a simple zero- 
temperature model based on an extended Gross-Pitaevskii 
equation that includes both the vertical harmonic confine- 
ment and an additional repulsive Lee-Huang-Yang term. The 
latter is obtained assuming the Bogoliubov spectrum of a 
three-dimensional homogeneous mixture (17). In Fig. 3B we 
compare the experimental results to the predicted critical 
atom number and droplet size (solid lines). We find qualita- 
tive agreement for the complete magnetic field range with no 
adjustable parameters. In the weakly attractive regime the 
agreement is even quantitative, similar to the dipolar erbium
experiments of ref. (8). In contrast, when increasing the 
effective attraction, the droplets are more dilute than ex- 
pected. In particular, their size exceeds the theoretical pre- 
dictions by up to a factor of three. This is almost one order of 
magnitude larger than our imaging resolution, excluding fi- 
nite-resolution effects. Furthermore, the critical atom num- 
ber is a factor of two smaller than the theoretical value. 
Interestingly, a similar discrepancy was reported for dipolar 
dysprosium droplets, with a critical atom number one order 
of magnitude smaller than expected (9). There, the deviation 
was attributed to an insufficient knowledge of the back- 
ground scattering length. This explanation seems unlikely in 
the case of potassium (17), where excellent interaction poten- 
tials are available (16, 24, 25). 

Other physical mechanisms might be responsible for the 
diluteness of the observed droplets. Although our system is 
three-dimensional, the confinement along the vertical direc- 
tion might affect the Lee-Huang-Yang energy, modifying its 
density and interaction dependence or introducing finite-size 
effects; however, a description of quantum fluctuations in the 
dimensional crossover between two and three dimensions is 
challenging. Interestingly, the almost perfect cancellation 
of the mean-field energy could reveal corrections other than 
the Lee-Huang-Yang term. Higher-order many-body terms 
might play a role, as proposed in ref. (3) for single-component 
systems. Taking them into account analytically requires 
knowledge of the three-body interaction parameters of the 
mixture, which are non-universal and difficult to estimate in 
our interaction regime. Alternatively, our results could be 
compared to ab initio quantum Monte Carlo simulations (26). 
Given the ultra-dilute character and simple microscopic description
of our system, a direct comparison to different theoretical
approaches could provide insights into as-yet
unmeasured many-body effects. 

Future research directions include studying the spectrum 
of collective modes of the droplets (8). Their unconventional 
nature not only provides a sensitive testbed for quantum 
many-body theories, but should also give access to zero-tem- 
perature quantum systems (4) not present in the dipolar case 
(27). Our experiments could also enable the exploration of 
low-dimensional systems, where the enhanced quantum fluc- 
tuations make droplets ubiquitous (28). Finally, a coherent 
coupling between the two components (29) is expected to 
yield effective three-body interactions (30) and provide con- 
trol over the density dependence of the Lee-Huang-Yang term 
(31). 

Note added in proof: After submission of this work, re- 
lated experiments have been performed by the LENS group 
(32). 
Fig. 1. Observation of quantum droplets. (A) Scattering lengths
$a$ (solid lines) and parameter $\delta a = a_{\uparrow\downarrow} + \sqrt{a_{\uparrow\uparrow} a_{\downarrow\downarrow}}$ (dashed line) vs.
magnetic field $B$ for a $^{39}$K mixture in states $|\uparrow\rangle = |F, m_F\rangle = |1,-1\rangle$
and $|\downarrow\rangle = |1,0\rangle$. The condition $\delta a = 0$ (dashed vertical line)
separates the repulsive ($\delta a > 0$, grey area) and attractive ($\delta a < 0$,
white area) regimes. (B) Schematic view of the experiment.
Atoms are prepared in a plane of a blue-detuned optical lattice
created by two beams intersecting at a small angle, and imaged
in situ with a high numerical aperture objective (<0.97(4) $\mu$m
measured resolution, 1/e Gaussian width). The spin composition
of the system is verified independently via Stern-Gerlach
separation by a magnetic field gradient during time-of-flight
expansion. During the preparation sequence, a red-detuned
optical dipole trap (not shown) provides radial confinement (17).
(C) Typical in situ images taken at time $t$ after removal of the
radial confinement but in the presence of the lattice potential.
Top row: expansion of a gaseous mixture [$B = 56.935(9)$ G and
$\delta a = 1.2(1)\,a_0 > 0$]. Central row: formation of a self-bound mixture
droplet [$B = 56.574(9)$ G and $\delta a = -3.2(1)\,a_0 < 0$]. Bottom row:
collapse of a single-component $|\downarrow\rangle$ attractive condensate [$B =
42.281(9)$ G and $a = -2.06(2)\,a_0 < 0$]. In our geometry, quantum
pressure cannot stabilize bright solitons. Therefore, the
existence of self-bound liquid droplets is a direct manifestation
of beyond mean-field effects. 
Fig. 2. Liquid-to-gas transition. (A) Atom number $N$ and radial size $\sigma_r$ of the mixture for different evolution times $t$.
The measurements are taken in the repulsive ($\delta a = 1.2(1)\,a_0 > 0$, red circles) and attractive ($\delta a = -3.2(1)\,a_0$, blue
circles) regimes. Top panel: while for $\delta a > 0$ the atom number in the gas remains constant, for $\delta a < 0$ it decreases on
a timescale compatible with three-body recombination (17). Central panel: the radial size of the droplet remains
constant at $\sigma_r \approx 6$ $\mu$m, demonstrating its self-bound nature. In contrast, the size of the gas increases continuously
with time. Bottom panel: closer view of $\sigma_r$ for $\delta a < 0$. For $t > 25$ ms the droplet dissociates and a liquid-to-gas
transition takes place. The inset displays images taken at $t$ = 25-35 ms in 2 ms increments, corresponding to the
points in the grey area. (B) Radial size $\sigma_r$ (top panel) and peak density $n_0$ (bottom panel) vs. $N$. For $\delta a < 0$ and large
atom number both remain approximately constant, as expected for a liquid. At a critical atom number $\sigma_r$ rises
suddenly and $n_0$ plunges, signaling the liquid-to-gas transition. In the gas phase, the $\delta a < 0$ system behaves like the
$\delta a > 0$ one. Inset (top panel): sketch of the phase diagram. In the liquid phase (blue region), observing the mixture at
variable evolution times gives access to different values of $N$ (black arrow). For all panels, error bars represent the
standard deviation of 10 independent measurements. If not displayed, error bars are smaller than the size of the
symbol. Additionally, $N$ has a calibration uncertainty of 25% (17). 
Fig. 3. Liquid-to-gas phase diagram. (A) Radial size of the mixture $\sigma_r$ as a function of atom number $N$ for
different magnetic fields $B$, from strong to weak attraction (top to bottom). The critical atom number $N_c$
increases as attraction decreases. Solid lines display the phenomenological fit $\sigma_r(N) = \sigma_0 + A/(N - N_c)$
used to locate the liquid-to-gas phase transition. (B) $N_c$ (top panel) and $\sigma_r$ for fixed $N = 1.5(1) \times 10^{4}$
(bottom panel) as a function of $B$. The upper horizontal axis shows the corresponding values of $\delta a$. Solid lines
are the predictions of an extended Gross-Pitaevskii model without fitting parameters (see text). Error bars for
$\sigma_r$ correspond to the standard deviation of 10 independent measurements. If not displayed, error
bars are smaller than the size of the symbol. Error bars for $B$ and $N$ show the systematic uncertainty of the
corresponding calibrations (17). 
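The phenomenological fit named in the caption, σ_r(N) = σ_0 + A/(N - N_c), can be sketched as follows. This is one possible fitting strategy, not necessarily the one used for the figure: for each trial N_c the model is linear in σ_0 and A, so those are solved by ordinary least squares while N_c is scanned over a grid. The function name and the synthetic data are hypothetical.

```python
def fit_sigma_vs_n(atom_numbers, sizes, nc_grid):
    """Least-squares fit of sigma(N) = sigma0 + A / (N - Nc).

    For each trial Nc the model is linear in (sigma0, A), so both are obtained
    in closed form; Nc is chosen by scanning nc_grid for the smallest residual.
    Returns (sigma0, A, Nc, sum_of_squared_residuals).
    """
    best = None
    m = len(atom_numbers)
    for nc in nc_grid:
        if nc >= min(atom_numbers):
            continue  # the model diverges at N = Nc, so keep Nc below the data
        x = [1.0 / (n - nc) for n in atom_numbers]
        mean_x = sum(x) / m
        mean_y = sum(sizes) / m
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, sizes))
        a = sxy / sxx
        sigma0 = mean_y - a * mean_x
        sse = sum((sigma0 + a * xi - yi) ** 2 for xi, yi in zip(x, sizes))
        if best is None or sse < best[3]:
            best = (sigma0, a, nc, sse)
    return best


if __name__ == "__main__":
    # Synthetic, noise-free data generated from hypothetical parameters.
    ns = [3500, 4000, 5000, 7000, 10000, 15000, 20000]
    sizes = [5.0 + 6000.0 / (n - 3000.0) for n in ns]
    sigma0, a, nc, sse = fit_sigma_vs_n(ns, sizes, range(0, 3500, 10))
    print(sigma0, a, nc, sse)
```

Scanning N_c on a grid keeps the example dependency-free; with real, noisy data a nonlinear least-squares routine with uncertainty estimates would be the more natural choice.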
INDUCED SEISMICITY 


Hydraulic fracturing volume is 
associated with induced earthquake 
productivity in the Duvernay play 


R. Schultz,* G. Atkinson, D. W. Eaton, Y. J. Gu, H. Kao 


A sharp increase in the frequency of earthquakes near Fox Creek, Alberta, began in
December 2013 in response to hydraulic fracturing. Using a hydraulic fracturing database,
we explore relationships between injection parameters and seismicity response. We show
that induced earthquakes are associated with completions that used larger injection volumes
($10^{4}$ to $10^{5}$ cubic meters) and that seismic productivity scales linearly with injection volume.
Injection pressure and rate have an insignificant association with seismic response. Further
findings suggest that geological factors play a prominent role in seismic productivity, as
evidenced by spatial correlations. Together, volume and geological factors account for ~96%
of the variability in the induced earthquake rate near Fox Creek. This result is quantified by a
seismogenic index-modified frequency-magnitude distribution, providing a framework to
forecast induced seismicity. 


Subsurface injection of fluid may induce
earthquakes (1) through anthropogenic alteration
of crustal stresses (2). In the case
of hydraulic fracturing (HF), high-pressure
injection of fluid intended to increase the
permeability of tight shales has been known to
trigger earthquakes (3), some of which are large
enough to be recorded or felt regionally (4-8).
Within the Western Canada Sedimentary Basin,
the recent increase in seismicity has been largely
attributed to HF (9). Moreover, earthquakes in
the Duvernay Formation (10) have been among
the largest-magnitude events caused by HF
completions globally (11, 12). These events have
appreciably increased the seismic hazard in the
area, and felt ground motions have resulted in
the implementation of a traffic light protocol
(TLP) (13). 

Despite recent progress in characterizing the 
Duvernay-related earthquakes (14, 15), many 
scientifically critical questions have remained 
unresolved. For example, it is not clear why 
there was a large time delay (~3 years) between 
the first Duvernay play completion and the ini- 
tiation of HF-related earthquakes (14), after which 
time such earthquakes became frequent occur- 
rences. Moreover, it is not well understood why 
only a small subset of operations appear to be 
seismogenic (9). Within the Duvernay play specif- 
ically, only those completions located in the 
Kaybob region are seismogenic, whereas all HF 
completions in the Willesden Green and Edson 
regions have been seismically quiescent (fig. S1). 
Geological factors have been suggested to con- 
tribute to the spatial distribution of these earth- 
quakes (16, 17). However, to date, there has been 
little understanding or documentation of the 
relative contributions of surface injection pres- 
sure, rate, and volume to the seismogenic process. 

To address these questions, we first examined 
the timing and location of earthquakes in rela- 
tion to HF completions in the seismogenic 
Kaybob region of the Duvernay play (Fig. 1). We 
compiled a database of all (~300) horizontal HF 
well completions in the Kaybob Duvernay up to 
February 2016 (because of the ~1-year period of 
confidentiality) from public, hard-copy regulator 
records. Individual wells are aggregated into 
~180 well “pads” on the basis of the proximity 
of multiple wells oriented along similar trajec- 
tories (fig. S2). A cursory examination of the 
time-averaged evolution of injected volumes per 
pad (Fig. 1B) indicates increasing pad design 
complexity and HF completion volumes, typical 
of maturing development in shale plays (18). 
The Kaybob Duvernay has been injected with 
more than 8.5 × 10^6 m^3 of fracturing fluid (as of 
February 2016) to stimulate well productivity. The 
observation of a relative increase in pad volumes 
before the first recorded earthquakes suggests 
that injected volume may be a controlling factor 
in the Kaybob Duvernay earthquake activity. 

Disposal is not likely to be a major factor in 
the induced seismicity in this area. The closest 
water-injection well to Crooked Lake that was 
actively injecting at depths similar to those of 
the Duvernay during the seismogenic period is 
~35 km away. Disposal wells within a 50-km 
radius of Crooked Lake injected only ~1.2 x 
10° m? of fluid during the Duvernay’s seismo- 
genic period (much less than the total volume 
involved in the HF operations). 

To examine the role of operational factors 
more closely, we associated clusters of seis- 
micity with seismogenic pads on the basis of a 
spatiotemporal association filter (SAF), which 
identifies the causally closest pad(s) in time and 
then space (supplementary materials). The valid- 
ity of these associations can be demonstrated by 
comparison with prior case studies (14, 15) and 
direct communication with the regulator and the 
anonymous companies responsible. Based on the 
SAF, ~10% of the pads and ~15% of the wells in 
the Kaybob Duvernay are associated with seis- 
micity. Of the completions associated with in- 
duced earthquakes, ~50% are single-well pads. 
Using this subset of associated pads, we then 
contrasted operational parameters at seismo- 
genic pads, wells, and stages with those of their 
parent distributions, as derived from the entire 
Kaybob region of the Duvernay play (Fig. 2). 
Statistical distributions of operational parame- 
ters (fig. S3) were analyzed using the Kolmogorov- 
Smirnov (KS) test (19) to discern whether it is 
likely that the seismogenic distributions are sam- 
pled randomly from their parent distribution. 
In these tests, we used the standard significance 
level of 0.05. The computed P values (Fig. 2) 
allowed a rejection of the hypothesis that the 
seismogenic distributions are subsampled from 
the parent Kaybob Duvernay distribution. On 
the basis of finding nonrandomness, the Mann- 
Whitney (MW) U test (20) was applied next 
to determine whether the seismogenic sub- 
sets have significantly larger median values of 
key operational parameters than do their parent 
distributions. We performed both one- and two- 
tailed MW tests to first determine whether sub- 
set median values are different from the parent 
distribution (two-tailed) and then further assess 
whether the subset median values are larger 
than their parent distributions (one-tailed). This 
analysis (P < 0.05) demonstrated that seismo- 
genic pads, wells, and stages are associated with 
larger injected volumes at a statistically signif- 
icant level (Fig. 2). Analogously, an analysis of 
HF in the Horn River Basin found a relationship 
between induced earthquake productivity and 
volume (21). Examination of pad, well, and stage 
pressures suggests no significantly compelling 
association (fig. S3). Although injection rate has 
been suggested as a driving factor for disposal- 
induced seismicity in the central United States 
(22), we did not observe a meaningful association 
with injection rates. Potentially, this discrepancy 
could be due to differences in injection opera- 
tions for disposal and HF; for example, the 
slowest Duvernay injection rates are nearly an 
order of magnitude faster than the critical rate 
threshold identified for seismogenic disposal (22). 
Thus, the effects of rate on HF-induced seismic- 
ity in the Kaybob region are either secondary, 
indiscernible, or negligible. The robustness of 
these findings was established using bootstrap 
(23) resampling sensitivity tests (supplementary 
materials and figs. S4, S5, and S6). 
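The two test statistics used in this comparison can be written down compactly. The sketch below computes only the statistics themselves (the two-sample Kolmogorov-Smirnov distance and the Mann-Whitney U), not the P values, and it is not the authors' analysis code. The function names and the example volumes are hypothetical.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest vertical distance between the two empirical CDFs.
    """
    points = sorted(set(sample_a) | set(sample_b))

    def ecdf(sample, x):
        return sum(1 for v in sample if v <= x) / len(sample)

    return max(abs(ecdf(sample_a, x) - ecdf(sample_b, x)) for x in points)


def mann_whitney_u(sample_a, sample_b):
    """Mann-Whitney U statistic for sample_a.

    Counts pairs (a, b) with a > b; tied pairs contribute one half.
    """
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


if __name__ == "__main__":
    # Hypothetical injected volumes (arbitrary units) for illustration only.
    seismogenic = [5.0, 6.0, 7.0]
    others = [1.0, 2.0, 3.0, 4.0]
    print("KS D =", ks_statistic(seismogenic, others))
    print("MW U =", mann_whitney_u(seismogenic, others))
```

A large KS distance indicates that the two samples are unlikely to share one distribution, and a U close to its maximum (the product of the two sample sizes) indicates that the first sample tends to have larger values; converting either statistic to a P value requires its null distribution, which a statistics library would normally supply.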

The results of the KS and MW tests show 
that volume is a controlling factor for HF- 
induced earthquakes in the Kaybob Duvernay. 
Potentially, this observation may be the result 
of greater injection volumes allowing for larger 
stimulated reservoir volumes and thus greater 
likelihood of intersecting a critically stressed 
fault (24). However, this does not necessarily 
imply that it controls the maximum possible 
magnitude. This finding echoes similar con- 
clusions from other induced cases (25). The 
relationship between volume and induced earth- 
quakes has important implications for the 
management of HF-related seismic hazard. For 
example, controls on seismicity related to volume 
would be affected by the time-dependent nature 
of increasing pad completion volumes within a 
maturing play (Fig. 1B and fig. S7). On the other 
hand, the use of lower pad completion volumes

Fig. 1. HF-induced seismicity within the Duvernay play up to February
2016. (A) Spatial distribution of induced earthquakes (red circles) associated
with HF wells (dark gray “tadpoles”; associated wells are bolded and black)
in the Duvernay play (purple shaded area) near Fox Creek (dark gray) and
Crooked Lake (blue). The inset map shows the location of the Duvernay


since the first TLP red light (Fig. 1B) could have 
resulted in an overall reduction in earthquake 
response from the Duvernay play. Whether this 
decrease in volume after the first TLP red light 
resulted in a net decrease in seismic hazard is 
more complicated, however, because hazard is 
dependent on multiple factors, some of which 
are independent of injection volume (26, 27). 
These points would require a physical model 
to validate the statistical earthquake-volume 
association. 
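
The statistical association discussed above rests on two-sample comparisons such as the KS test. As a minimal sketch, the KS statistic is the largest gap between the empirical cumulative distributions of two samples. The function name `ks_statistic` and the pad volumes below are hypothetical and serve only to illustrate the comparison; they are not the study's data or code.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max |F_a(x) - F_b(x)|."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both empirical CDFs past every observation equal to x.
        while i < len(a) and a[i] == x:
            i += 1
        while j < len(b) and b[j] == x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


# Hypothetical pad volumes (10^4 m^3); not data from the study.
all_pads = [0.5, 0.8, 1.1, 1.6, 2.0, 2.4, 3.1, 4.0]
seismogenic_pads = [2.0, 2.4, 3.1, 4.0]
print(ks_statistic(seismogenic_pads, all_pads))
```

A large statistic indicates that the seismogenic subset is drawn from a different part of the volume distribution than the full population; converting it to a P value requires the sample sizes as well.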

To validate this relationship between injected
volume and earthquakes (28), we first considered
the Gutenberg-Richter frequency-magnitude
distribution (GR-FMD) (29, 30)

$$N_M = 10^{a}\,10^{-bM} \qquad (1)$$


In this formulation, $N_M$ is the number of events
greater than magnitude M, the a-value governs 
the rate at which earthquakes occur, and the 
b-value measures the proportion of relatively 
smaller events to larger ones. To consider the 
time-varying rate of induced earthquakes, mod- 
ifications have been suggested on the basis of 
solutions to the diffusion equation that incorporate 




Formation in North America. (B) Timing and magnitude of induced earth- 
quakes (red circles) alongside stage volumes (gray) and a 30-pad running 
average of pad volumes (black line). Colored areas indicate Alberta Energy 
Regulator (AER) traffic light protocol (TLP) cut-offs (13). Initial wells stimulated 
during March 2010 were included in the analysis but are not depicted.

Fig. 2. Distributions of HF volumes in the Kaybob Duvernay.
Volumes are plotted as histograms on a per-stage (A), -well
(B), and -pad (C) basis for seismogenic (red) and all (gray) pads.
P values from comparisons of complete volume distributions with
seismogenic subsets are inset [KS, Kolmogorov-Smirnov; MW,
Mann-Whitney; 1(2)T, one(two)-tailed]. The dotted line depicts the
required detection threshold volume, $V_c$.


the nonstationary effects of injection volume
$V(t)$, e.g., $a = \Sigma + \log_{10}[V(t)]$ (25, 31-33). In
this equation, $\Sigma$ is the seismogenic index, an
injection-invariant parameter that represents
the seismotectonic response to increasing fluid
pressure within the study area (25). Incorporation
of $\Sigma$ in the GR-FMD modifies Eq. 1 to
(25, 31-33)

$$N_M = V(t) \cdot 10^{\Sigma}\,10^{-bM} \qquad (2)$$

We investigated the implications of Eq. 2 for 
the Kaybob Duvernay HF earthquakes by first 
compiling the time-dependent histories of com- 
pletion volumes and the number of earthquakes 
above the detection threshold (fig. S8). We ob- 
served a strong correlation between these varia- 
bles, especially if only the SAF-associated pads 
were considered (Fig. 3). The best-fit parameters 
of Eq. 2 are $b = 0.90 \pm 0.03$ and $\Sigma = -1.8 \pm 0.2$ for
the detection threshold of local magnitude ($M_L$)
1.3 (supplementary materials and fig. S8). This
regionally averaged $\Sigma$ value near Fox Creek is
very similar to $\Sigma$ values computed for earthquakes
in the Brazeau Cluster, which are related to
wastewater disposal in the Cordel Field ($-2.1 \pm 0.2$)
(34), and to $\Sigma$ values for numerous case studies
worldwide (35). This is an important finding 
because it suggests that the regional seismic 
response to fluid injection in central Alberta is 
similar for both HF and disposal wells. 
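
As a worked illustration of Eq. 2, the sketch below evaluates the expected number of events at or above a given magnitude for a cumulative injected volume, using the best-fit values quoted above. The function name `expected_events` is hypothetical, the example volume is arbitrary, and the sketch assumes that volume is expressed in cubic meters.

```python
def expected_events(volume_m3, magnitude, sigma=-1.8, b=0.90):
    """Expected number of events >= magnitude from Eq. 2: N = V * 10**(sigma - b*M)."""
    return volume_m3 * 10.0 ** (sigma - b * magnitude)


# Arbitrary example: 1e5 m^3 injected, counted above the detection threshold M_L 1.3.
n_detectable = expected_events(1.0e5, 1.3)
print(f"expected events with M_L >= 1.3: {n_detectable:.1f}")
```

Because the expected count is linear in volume, doubling the injected volume doubles the expected number of events at every magnitude, while each unit increase in magnitude reduces the count by a factor of $10^{b}$.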

We repeated the fitting process for individual 
clusters associated with seismogenic wells and 
found $\Sigma$ and b-values that vary from -2.5 to -0.5
and 0.7 to 1.7, respectively (fig. S9). The relatively 
high variability of these parameters likely reflects 
the difficulty in making robust parameter deter- 
minations from small data subsets. More com- 
plete earthquake catalogs would enable a more 
robust analysis of seismic parameters for the
individual clusters, improving confidence and
possibly reducing variability. Overall, we found
seismic values for individual clusters that are
roughly similar to $\Sigma$ values previously determined
for clusters on a local array (5). The determina- 
tion of these values provides a powerful tool with 
which to forecast the expected number of future 
earthquakes at seismogenic wells and their likely 
magnitude distribution. 
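
To illustrate why parameter estimates from small clusters are variable, the following sketch applies one common estimator, the Aki (1965) maximum-likelihood b-value, to a synthetic catalog. This estimator is an assumption for the example; the source does not state which fitting method was used, and the catalog below is simulated rather than observed.

```python
import math
import random


def b_value_mle(magnitudes, m_c):
    """Aki maximum-likelihood b-value for continuous magnitudes >= m_c."""
    above = [m for m in magnitudes if m >= m_c]
    mean_excess = sum(above) / len(above) - m_c
    b = math.log10(math.e) / mean_excess
    # First-order standard error: shrinks only as 1/sqrt(number of events).
    return b, b / math.sqrt(len(above))


# Synthetic Gutenberg-Richter catalog with b = 1.0 above M_c = 1.3.
rng = random.Random(42)
true_b, m_c = 1.0, 1.3
catalog = [m_c + rng.expovariate(true_b * math.log(10)) for _ in range(20000)]
b_hat, b_err = b_value_mle(catalog, m_c)
print(f"b = {b_hat:.3f} +/- {b_err:.3f}")
```

The standard error scales as $b/\sqrt{N}$, so a cluster of 25 events carries an uncertainty of roughly 20% in b, which is consistent with the spread reported for individual clusters.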

In light of these findings, we consider the
hypothesis that the ~3-year-delayed response in
earthquake productivity was simply the result of
the minimum injection volume ($V_c$) required to
raise the seismicity rate to a sufficient level for
observation (i.e., so that it produces an earthquake
larger than the regional detection limit). Conservative
upper-bound estimates using a detection
limit of $M_L$ 2.0 (the TLP requires operators
to report all events of $M_L$ 2.0 and greater) and
95% probability of exceedance suggest a $V_c$ of
$(8.0 \pm 0.2) \times 10^{3}$ m$^3$ (supplementary materials).
This number is comparable to the average total
injected volume at pads at the time of their first
corresponding earthquakes, $V_1 = (1.0 \pm 1.0) \times 10^{4}$ m$^3$.
Although $V_c$ and $V_1$ roughly agree, it is
likely that $V_1$ is systematically larger than $V_c$
owing to insufficient resolution with which to
discern aseismic stages or wells within a seismogenic
pad. Corroborating these results, studies
in the Horn River Basin found that seismic response
appeared to “turn on” during months
in which HF injection volumes were greater than
$2.0 \times 10^{4}$ m$^3$ (27). Because fewer than 10 pads
in the Kaybob Duvernay have a volume per
pad less than $V_c$, we argue that it is unlikely
that the spatial distribution of our catalog has
been seriously biased by $V_c$. We justify this claim
through comparison of our regional catalog
(14, 36) with catalogs supplemented by local
operator networks (15). In both cases, we observe
the same spatial distributions of earthquakes
and the same associations with seismogenic 
wells and regions. 
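
The idea of a minimum volume for detection can be sketched under a simple assumption that is not taken from the source: if events occur as a Poisson process with a mean given by Eq. 2, the volume at which at least one event above the detection limit becomes 95% probable follows directly. The function name `critical_volume` and the Poisson assumption are illustrative; the authors' supplementary calculation may differ.

```python
import math


def critical_volume(m_detect, p_exceed=0.95, sigma=-1.8, b=0.90):
    """Volume (m^3) at which P(at least one event >= m_detect) reaches p_exceed,
    assuming Poisson occurrence with the mean count given by Eq. 2."""
    n_required = -math.log(1.0 - p_exceed)  # Poisson mean giving P(N >= 1) = p_exceed
    return n_required / 10.0 ** (sigma - b * m_detect)


v_c = critical_volume(2.0)
print(f"V_c for M_L 2.0 at 95% probability: {v_c:.3g} m^3")
```

With the regional best-fit parameters, this sketch gives a volume on the order of $10^{4}$ m$^3$, the same order of magnitude as the estimates quoted above, although it does not reproduce the reported value exactly.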

Although pad completion volumes have been 
increasing with time, it is interesting that more 
than 20 pads (~50%) that were completed before 
the first induced earthquake have volumes greater 
than $1.5 \times 10^{4}$ m$^3$. This finding indicates that
although volume appears to be a controlling 
factor for Kaybob Duvernay induced seismicity, 
other factors are also playing a nontrivial role. 
Prior work has suggested that the spatial dis- 
tribution of all induced seismicity within central 
Alberta has been influenced by geological factors, 
with earthquakes preferentially occurring along 
underlying reef margins (16, 37) or within regions 
of relatively higher formation overpressure (17). 

Following this rationale, we accommodate
spatial factors using a modification
$\Sigma = \Sigma' + \log_{10}[\delta(\vec{r})]$ that explicitly introduces a
spatial variable $\vec{r}$ so that Eq. 2 becomes

$$N_M = V(t) \cdot \delta(\vec{r}) \cdot 10^{\Sigma'}\,10^{-bM} \qquad (3)$$
In our formulation, we define $\delta(\vec{r})$ to have binary
values of 1 or 0 to indicate regions that do or do
not experience earthquakes, respectively. In this
limiting case, $\Sigma'$ becomes equivalent to $\Sigma$ [i.e.,
when $\delta(\vec{r}) = 1$]. Introducing this spatial term is a
useful refinement, because numerous plays have 
been stimulated using similar per-well volumes 
(38, 39) with limited or undocumented seismic 
response. Even considering only the Duvernay, 
induced earthquakes have been restricted to one 
area, the Kaybob region (fig. S1). Furthermore, 
numerous high-volume Kaybob pads, wells, and 
stages are not associated with earthquakes (Fig. 2). 
In fact, the first seismogenic pads observed in 
the Duvernay (10) were also the first pads to complete
in the most seismically susceptible region,
~10 km from Crooked Lake (fig. S7). The efficacy
of introducing this spatial term for the Kaybob 
Duvernay is quantified by the significant enhance- 
ment of goodness-of-fit to the data when using the 
SAF (Fig. 3). In this sense, the SAF represents a 
rudimentary and empirical estimation of $\delta(\vec{r})$.
Taken in the context of the original formulation,
the spatial variability of $\Sigma$ can be cast as
$\Sigma = -1.8 + \log_{10}(\mathrm{SAF})$. The incorporation of
the SAF as an empirical estimate of $\delta(\vec{r})$ in a
$\Sigma$-modified GR-FMD results in a model that
accounts for ~96% of the seismic response 
variability within the Kaybob data set (Fig. 3B). 
This means that complexities related to other 
potential factors account for only the remaining 
4% of variability or constitute part of the proposed 
$\delta(\vec{r})$. This is an important finding because these
additional factors are numerous; they include well
flow-back, effects of staged stimulation pressures,
well shut-in response, lagged seismicity relative to
seismogenic pads, petroleum production, earthquake
aftershock sequences, operator seismic mitigation
strategies, varieties of fracturing fluids and
proppants, SAF resolution limitations, interpad
communication, poroelastic triggering effects,
magnitude uncertainties, and spatiotemporal variability
in b-values or $\Sigma'$. This observation is further
confirmed by applying additional filtering to the 
data (figs. S10 and S11). 
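
For clarity, the step from Eq. 2 to Eq. 3 can be written out explicitly. The derivation below only substitutes the spatial modification of the seismogenic index into Eq. 2 and uses the symbols defined in the text; the notation $\vec{r}$ for the spatial variable is an assumption of this rendering.

```latex
\begin{align}
N_M &= V(t)\,10^{\Sigma}\,10^{-bM} \\
    &= V(t)\,10^{\Sigma' + \log_{10}[\delta(\vec{r})]}\,10^{-bM} \\
    &= V(t)\,\delta(\vec{r})\,10^{\Sigma'}\,10^{-bM}
\end{align}
```

With the SAF taken as the empirical estimate of $\delta(\vec{r})$ and $\Sigma' = -1.8$, volume injected where $\delta(\vec{r}) = 0$ contributes no expected events, whereas volume injected where $\delta(\vec{r}) = 1$ contributes exactly as in Eq. 2.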

Although in reality, a fault is either reactivated 
or not, incomplete information about the sub- 
surface often prevents definitive assessments a 
priori. Our spatial parameter is a useful concept 
in this regard. Statistically, we can extend $\delta(\vec{r})$ to
represent the seismogenic activation potential, 
defined as the likelihood of a well inducing a 
detectable earthquake at a given location. We 
interpret this parameter as the probabilistic

Fig. 3. Number of earthquakes above the detection threshold ($M_L$ 1.3) versus cumulative injection
volume. (A) Volumes from all HF operations within the Kaybob Duvernay play (red circles) are
compared with the best fit of the data points (black line). (B) Analogous to (A), except only seismogenic HF
pads are considered. Both panels cover a period during which the detection threshold remains constant
(July 2014 to February 2016). Well flow-back was not considered in computing volumes. In both panels, the
goodness-of-fit of the data to the expected line is displayed with the $R^2$ (coefficient of determination) values.


intersection of all geological conditions required 
to cause an induced earthquake at a given 
location—i.e., the spatial variability of the geo- 
logical susceptibility to induced seismicity. This 
interpretation is intuitive, owing to the derivation
of $\delta(\vec{r})$ from $\Sigma$, which constrains the seismotectonic
state of the point of injection. For a
demonstration of this interpretation, we con- 
sider the effects of distance to underlying fossil 
reef margins (16, 37) and formation overpressure 
(17) as regional proxies for faulting and stress, 
respectively. The fractions of seismogenically 
associated pads as a function of these proxies 
are plotted (Fig. 4), confirming that regions of 
development that are closer to the reef margin or 
more highly overpressured have been more likely to 
induce earthquakes (fig. S12). For western Canada, 
an entire basin-wide average of this activation 
probability in regions that are also coincident with 
viable HF plays appears to be less than 0.3% (9). 
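
An empirical activation potential of this kind can be sketched as the fraction of seismogenic pads in each bin of a geological proxy, with a binomial standard deviation as the error bar. The function name `seismogenic_fraction`, the binomial error model, and the bin counts below are assumptions for illustration; they are not the values or the exact method behind Fig. 4.

```python
import math


def seismogenic_fraction(n_seismogenic, n_total):
    """Fraction of seismogenic pads in a bin and its binomial standard deviation."""
    p = n_seismogenic / n_total
    return p, math.sqrt(p * (1.0 - p) / n_total)


# Hypothetical bins of distance to the reef margin (km):
# (label, seismogenic pads, all pads). Not data from the study.
bins = [("0-10", 12, 30), ("10-20", 5, 40), ("20-30", 1, 50)]
for label, k, n in bins:
    frac, err = seismogenic_fraction(k, n)
    print(f"{label} km: {frac:.2f} +/- {err:.2f}")
```

A fraction that declines with distance from the reef margin, beyond the size of the error bars, is the pattern that would support the proxy as a control on susceptibility.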
Solely on the basis of the $\Sigma$ modifications, it
is possible to improve hazard estimates for future 
injections in real time during stimulation. Knowl- 
edge of fault size, hydraulic connectivity to prox- 
imal stimulation stages, and injected volumes 
that may be directed to reactivated faults can 
serve as input to calibrated models to estimate 
induced earthquake rates and magnitude dis- 
tributions. Conversely, microseismic monitoring 
of HF completions may be used to scrutinize the 
number and rate of induced events resulting from 
individual stage stimulations as an indicator of 
hydraulic connectivity to nearby faults. For ex- 
ample, HF pads that are oriented subparallel to 
the north-south-oriented fault planes (12, 14, 15) 
are likely to be in extended hydraulic communi- 
cation with seismogenic faults as compared with 
northeast-southwest-oriented pads. This ratio- 
nale may explain why the three largest-magnitude 
clusters occurred at north-south-oriented pads: 
Greater volume pumped into these faults would 
allow for more numerous induced events and 
thus an increased likelihood of a larger event.

Fig. 4. Statistics of HF operations in relation to the Swan Hills Formation
and Duvernay overpressure. (A) Distance of HF operations to
the Swan Hills Upper Bank edge for seismogenic (red) and all (gray) pads,
shown as a histogram. (B) Similar to (A), except each bin has been
normalized (red circles), and error bars have been added. (C) Pressure
gradient for seismogenic (red) and all (gray) pads. (D) Fraction of
seismogenic pads as a function of pressure gradient (red circles) with
error bars from the standard deviation.

Extended to the regional scale, this study can be
used to better inform earthquake rate models in 
induced seismic hazard forecasts. Coupled with 
an estimation of the seismogenic activation po- 
tential, our proposed framework (Eq. 3) would 
allow for the quantification of both induced 
earthquake rate and location models. Rate and 
location models are some of the most critical pa- 
rameters in forecasting hazard related to induced 
earthquakes (26, 27). Although this study has 
focused on the Fox Creek area, the proposed frame- 
work can be applied to other jurisdictions to im- 
prove the management of induced seismic hazard. 


We find that the most important operational
parameter controlling induced earthquakes in
the Duvernay play near Fox Creek is injected
volume, which scales linearly with the total number
of earthquakes in a $\Sigma$-modified Gutenberg-Richter
formulation. Conversely, injection pressure
and rate appear unrelated to induced seismic- 
ity response near Fox Creek. Furthermore, wells 
that exhibit seismicity appear to display a strong 
spatial bias related to geological factors, which 
has a pronounced effect on the resultant seismic 
response. Last, this study provides a framework 
with which to incorporate the seismogenic activa- 
tion potential into seismic hazard analysis for 
induced earthquakes.

MAGNETIC MATERIALS


Chiromagnetic nanoparticles and gels 


Jihyeon Yeom, Uallisson S. Santos, Mahshid Chekini, Minjeong Cha,
André F. de Moura, Nicholas A. Kotov


Chiral inorganic nanostructures have high circular dichroism, but real-time control of their 
optical activity has so far been achieved only by irreversible chemical changes. Field 
modulation is a far more desirable path to chiroptical devices. We hypothesized that 
magnetic field modulation can be attained for chiral nanostructures with large 
contributions of the magnetic transition dipole moments to polarization rotation. We found 
that dispersions and gels of paramagnetic Co₃O₄ nanoparticles with chiral distortions
of the crystal lattices exhibited chiroptical activity in the visible range that was 10 times as
strong as that of nonparamagnetic nanoparticles of comparable size. Transparency of
the nanoparticle gels to circularly polarized light beams in the ultraviolet range was
reversibly modulated by magnetic fields. These phenomena were also observed for other 
nanoscale metal oxides with lattice distortions from imprinted amino acids and other 
chiral ligands. The large family of chiral ceramic nanostructures and gels can be pivotal for 
new technologies and knowledge at the nexus of chirality and magnetism. 


Optical materials that combine chirality and
magnetism are essential for spintronics,
magneto-optics, magnetochemistry, and
chiral catalysts (1, 2) because they allow
modulation of light beams, excited states, 
and chemical processes by means of a magnetic 
field. The junction of chirality and magnetism is 
central to skyrmions, spin catalysis, and the origin 
of homochirality in life on Earth (3-5), represent- 
ing some of the newly emerged and long-standing 
problems of physics, chemistry, and biology. For 
all of these physicochemical phenomena, it is es- 
sential to increase the coupling of the photon’s 
magnetic field with magnetic moments of elec- 
trons in chiral matter, which is expected to mark- 
edly enhance the chiroptical activity and first- and 
second-order magneto-optical phenomena, such 
as the Faraday effect, magnetic circular dichro- 
ism, and magneto-chiral dichroism (6-12). How- 
ever, optical materials that combine a large 
magnetic moment and chiral asymmetry are rare 
(13). The common examples of such materials that 
can be dubbed chiromagnetic are typically based 
on complexes of transition metals (14, 15). But
even for optical centers with rare-earth f orbitals 
accommodating multiple unpaired spins, the long- 
distance spin coupling enhancing the magnetic 
field effect on electronic transitions requires low 
temperatures of T = 5 to 7 K (14, 15).
Chiral inorganic nanoparticles (NPs) (16-19)
and their assemblies (12, 20) provide a new toolbox
for the design of materials combining chirality
and magnetism. Whereas the optical transitions
in rare-earth coordination compounds involve
localized molecular orbitals, the optical transitions
in NPs may engage orbitals involving thousands
of atoms, and so does their chirality (21).
The optical “center” responsible for chiroptical
properties in inorganic NPs is therefore orders of
magnitude greater in volume than that in coordination
compounds of f metals; magnetic coupling between atomic spins
is also facilitated by shorter distances between
magnetic atoms. Importantly, the chiral NPs may
also show distorted crystal lattices (22, 23) and ex- 
hibit (super)paramagnetism (12, 24-26). This set 
of NP characteristics enables enhancement and 
spectral tuning of multiple chiroptical properties
(27, 28). In this study, we focus on the first-order
effects of magnetic field on light absorption and 
circular dichroism (CD). The importance of mag- 
netic properties of NPs for absorption of circularly 
polarized photons can be easily inferred from the 
quantum mechanical parameter known as rotational
strength, $R_{0a}$, which can be calculated as

$$R_{0a} = \mathrm{Im}\left[\langle\Psi_0|\hat{\mu}|\Psi_a\rangle \cdot \langle\Psi_a|\hat{m}|\Psi_0\rangle\right] = \mathrm{Im}\left[\mu_{0a} \cdot m_{a0}\right] \qquad \text{(Eq. 1)}$$

where $\Psi_0$ and $\Psi_a$ are the wave functions for the
ground state (0) and excited state (a), $\hat{\mu}$ and $\hat{m}$ are
the corresponding electric and magnetic moment
operators, and $\mu_{0a}$ and $m_{a0}$ are the electric and
magnetic transition dipole moments, respectively
(6, 9). Equation 1 holds true for any and all quan- 
tum systems, whether chiral or not, whether bear- 
ing unpaired electrons or not, and whether in the 
presence of an external magnetic field or not 
(8, 29). As applied to chiral materials, Eq. 1 is 
usually simplified at the expense of the magnet- 
ic term (6-9, 30). For instance, in the case of 
plasmonic NPs, the magnetic moment term is 
reduced to a small constant, whereas the electric 
moment term is considered to be most essential
(6). However, in NPs with a large number of unpaired
electrons and a chiral crystal lattice, the
magnetic moment term $\langle\Psi_a|\hat{m}|\Psi_0\rangle$ should have
a contribution comparable to that of the electric
moment term $\langle\Psi_0|\hat{\mu}|\Psi_a\rangle$, which should lead
to enhanced $R_{0a}$ and potentially to practical realizations
of NP chirality in magneto-optical devices
operating at low fields and ambient temperatures. 
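
The role of the magnetic term in Eq. 1 can be made concrete with a short numerical sketch. The function name `rotational_strength` and the dipole values are hypothetical and in arbitrary units; the example assumes the usual situation for real wave functions, in which the electric transition dipole is real and the magnetic transition dipole is purely imaginary.

```python
def rotational_strength(mu_0a, m_a0):
    """R_0a = Im[mu_0a . m_a0] for three-component transition dipoles (Eq. 1)."""
    dot = sum(mu * m for mu, m in zip(mu_0a, m_a0))
    return complex(dot).imag


mu = (1.0, 0.0, 0.0)  # hypothetical electric transition dipole (arbitrary units)

parallel = rotational_strength(mu, (0.5j, 0.0, 0.0))
perpendicular = rotational_strength(mu, (0.0, 0.5j, 0.0))
antiparallel = rotational_strength(mu, (-0.5j, 0.0, 0.0))
print(parallel, perpendicular, antiparallel)
```

The rotational strength vanishes when the two transition dipoles are perpendicular and changes sign when their relative orientation is inverted, which corresponds to the mirror-image response of opposite enantiomers. It also grows in proportion to the magnitude of the magnetic transition dipole.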

Equally importantly, the large CD observed, 
for instance, for chiroplasmonic assemblies was 
difficult to translate to chiroptical devices, because 
all known approaches for real-time modulation 
of their optical activity are associated with ir- 
reversible chemical changes in the NP systems 
(31-37). At the same time, the helical assemblies 
of magnetic NPs are not known in the enantio- 
selective form (38). 

To address this set of fundamental and technological
questions, we synthesized ~5-nm Co₃O₄
NPs using the L- and D-enantiomers of cysteine
(Cys) as surface ligands. These NPs serve as the
primary experimental model in this study and
will be referred to as D-, L-, and DL-Cys Co₃O₄
when the corresponding Cys enantiomers or their 
equimolar mixture were used for NP synthesis. 
The choice of cobalt oxide as the inorganic core 
of the NPs was governed by its known magnet- 
ism and structural versatility, as well as the envi- 
ronmental robustness of cobalt-based ceramics. 
The chemical structure and atomic composition 
of the NPs was established by x-ray photoelectron 
spectroscopy (fig. S1, A to C) and atomic mapping 
(fig. S2) (39). Transmission electron microscopy 
(TEM) and scanning transmission electron micros- 
copy (STEM) indicated the frequent presence of 
NPs with a seemingly amorphous inorganic phase 
(fig. S1, D to F). When the crystal structure was 
observed, the lattice plane distances could be 
adequately described by those in the cubic spinel 
phase. The crystalline domains are confined to 
the central part of the NPs, indicating that the 
seemingly amorphous shells originate from the 
crystal lattice distortions in the vicinity of the NP 
surface caused by molecular imprinting from at- 
tached enantiomers of amino acids. Instead of 
the typical antiferromagnetic behavior of Co₃O₄
nanostructures (40, 41) owing to exchange interactions
in the spinel lattice, these NPs exhibit
paramagnetic behavior even below the Néel temperature
expected for Co₃O₄ (fig. S1, G to J); this
observation confirms that crystal lattice distor- 
tions are characteristic of the vast majority of the 
NPs in the ensemble formed in this synthesis. 

The brown transparent dispersions of these 
NPs yield CD spectra of high intensity, with up to 
eight positive and negative peaks in the ultraviolet 
(UV) and visible range (Fig. 1, A and C) corresponding
to various transitions [intraparticle
Co(II)→Co(III) (230, 280, and 350 nm) and surface
states (including ligands)→Co(III) (450, 550,
and 600 nm)], as demonstrated by the simplified
time-dependent density functional theory (42, 43)
calculation of a model NP (fig. S7 and table S1).
The CD spectra for D- and L-Cys Co₃O₄ showed a
nearly perfect mirror symmetry, whereas Co₃O₄
NPs made with equal amounts of D- and L-Cys
were chiroptically silent (Fig. 1A). The spectral
positions of the CD peaks were nearly perfectly
aligned with those in absorption (Fig. 1C). Chirop- 
tical anisotropy g-factors as high as 0.02 were 
obtained (Fig. 1B), which is ~10 times those ob- 
tained for other NPs of similar size, including 
plasmonic ones with known strong chiroptical 
activity (44, 45). The high value of the g-factor in 
the visible range can also be appreciated by the 
naked eye as the appearance of a distinct color 
when light passes through the NP-polyacrylamide 
gel and between crossed polarizers (Fig. 1D and 
scheme S1) (32); the green color corresponds to 
the 550-nm peak in the g-factor spectra in Fig. 1B. 
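
For readers who wish to relate measured spectra to the g-factor, the sketch below converts a CD signal in millidegrees and an absorbance at the same wavelength into the anisotropy factor. The function name `g_factor` and the example readings are hypothetical. The conversion of 32,980 mdeg per unit of differential absorbance is the standard relation between ellipticity and differential absorbance and is not stated in the source.

```python
MDEG_PER_ABSORBANCE_UNIT = 32980.0  # ellipticity [mdeg] = 32980 * (A_L - A_R)


def g_factor(cd_mdeg, absorbance):
    """Anisotropy factor g = (A_L - A_R) / A, equivalent to delta-epsilon / epsilon."""
    delta_a = cd_mdeg / MDEG_PER_ABSORBANCE_UNIT
    return delta_a / absorbance


# Hypothetical reading: 659.6 mdeg of CD at an absorbance of 1.0.
print(g_factor(659.6, 1.0))
```

Because concentration and path length cancel in the ratio, the g-factor allows chiroptical activity to be compared between samples of different optical density.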

Strong chiroptical activity was also observed
for Co₃O₄ NPs synthesized and capped by
L- and D-penicillamine, but CD peaks were at
different positions (fig. S11) than those for Cys, 
especially for the visible range. Importantly, little 
change in the CD spectra was observed when Cys 
ligands were exchanged for penicillamine after 
the formation of NPs (fig. S12), indicating that
chiroptical activity in the range of ~400 to 700 nm
is associated with the inorganic core of NPs. Con- 
servation of NP chirality after the ligand exchange 
is consistent with observations of chiral memory 
(46) and the high activation barrier for reconstruction
of the Co₃O₄ lattice once NPs are formed.
The kinetic stability of the chiral distortions also
provides a pathway to other chiral nanostructures
from Co₃O₄ by means of self-assembly using
ligand-depleted NPs (9).

Raman scattering spectra further validated the 
chirality of the inorganic cores in the NPs. Characteristic
bands at 380, 475, 516, 613, and 680 cm⁻¹
observed for D- and L-Cys Co₃O₄ NPs are associated
with Raman-active vibration modes of Co₃O₄
(Fig. 2C) (47). Raman optical activity spectra show
peaks of opposite polarity at 377, 465, 531, and
719 cm⁻¹ for the NPs carrying opposite enantiomers
of Cys (Fig. 2D). Of particular importance
is the strong peak at 380 cm⁻¹ that corresponds to
lattice phonons of Co₃O₄. It occurs at frequencies
higher than expected for Co₃O₄ with the cubic
spinel crystal lattice (47) and shows a distinct
antisymmetric relation of these vibrations in
Co₃O₄ carrying L- and D-Cys (48). Both facts
indicate the chirality transfer from amino acids 
to the crystal lattice of the inorganic core of the 
NPs, manifesting as crystal lattice distortions (49) 
propagating from the surface of the NPs into the 
core, which can also be seen in STEM images 
(fig. S1, E and F). 

Computational studies of atomic-scale dynamics
in Co₃O₄ NPs having either L- or D-Cys on the
surface were carried out to better understand the
nature of the chirality transfer and distortions in 
the NP core. Relatively small NPs with Cys residues 
that were coordinated identically with experi- 
mental NPs were used in the simulations (fig. S6). 
In the course of full structural optimization, em- 
blematic chiral geometries with mirror-image 
symmetries independently evolved for L- and
D-Cys-bearing NPs. Specifically, three ligands on
each corner of the tetrahedral model NP formed 
ringlike structures with either a clockwise or a 
counterclockwise sense of rotation with respect 
to the C₃ axis (Fig. 2, A and B). Using the
Cahn-Ingold-Prelog system, they can be classified as
M (clockwise) and P (counterclockwise) enantiomers.
Taking into account the hierarchical
chirality in these structures that arises from
superposition of the molecular chirality of the
amino acids and their orientation on the surface,
they can be denoted as M-D-Cys and P-L-Cys
Co₃O₄ NPs.

The normal-mode analysis of the model NP 
indicated that peaks at 505, 529, 601, and 693 cm⁻¹
correspond to the Raman-active breathing mode
of the inorganic core and ligand-core coupling
(fig. S14). The experimental peaks in Fig. 2C
match calculations for the M-D-Cys and P-L-Cys
NPs well, with many of the bands representing 
the coupled vibrations of the surface atoms and 
ligands (movies S1 to S4). 

In the ab initio molecular dynamics (AIMD) 
simulations, both enantiomers evolved indepen- 
dently, as demonstrated by the energy fluctua- 
tions of each model (fig. S19). Concomitantly, the 
degree of chirality increased fourfold, as
determined by the Hausdorff chirality measure
(50), for both the M-D- and P-L-Cys NPs (Fig. 2E),
indicating that thermal fluctuations increase the
degree of distortion of the NPs. Notably, even
when subjected to all of these vibrational dis- 
tortions and bond reorganizations, the two model 
NPs followed nearly mirrored paths during the 
entire course of the AIMD simulations (Fig. 2F 
and movie S5). 

MD simulations make it possible to follow the 
distortions being caused by the surface ligands in 
the ceramic crystal lattice. The NP cores carrying
L and D surface ligands present a pair of nearly
mirrored structures after 2000 fs of structural
relaxation (Fig. 3, B and C, and fig. S20, A and B).
It is possible to analyze selected dihedral angles
in these structures with respect to the value for
the ideal crystallographic packing of Co₃O₄ cubic
spinel (Fig. 3A). The three dihedral angles that
we chose, $\phi_{35\text{-}15\text{-}8\text{-}18}$, $\phi_{35\text{-}12\text{-}11\text{-}22}$, and $\phi_{35\text{-}13\text{-}14\text{-}27}$,
share O atom number 35, located at the center
of one of the faces of the NP model (Fig. 3, A
to C, and fig. S21, C and D), and have values
$\phi_{35\text{-}15\text{-}8\text{-}18} = \phi_{35\text{-}12\text{-}11\text{-}22} = \phi_{35\text{-}13\text{-}14\text{-}27} = 0$ in the
undistorted cubic spinel of Co₃O₄. The binding
of the surface ligands led to noncoplanarity of 
these atoms that was already evident in the energy-minimization
step, and these distortions increased

Fig. 1. Synthesized chiral Co₃O₄ NPs. (A) Circular dichroism (CD), (B) g-factor, defined as the
ratio between the molar circular dichroism Δε and the molar extinction coefficient ε (g = Δε/ε),
and (C) UV-visible absorption spectra of Co₃O₄ NPs stabilized by D-Cys, L-Cys, and DL-Cys.
(D) Photographs of light transmitting through the NPs, with the rotation of the linear analyzer
counterclockwise (-10°) and clockwise (+10°). (E) TEM image of L-Cys-capped Co₃O₄ NPs. mdeg,
millidegrees; a.u., arbitrary units.

during MD simulation. The mutual correlation of
these angles stemming from the concerted move- 
ment of all the atoms in the nanoscale structures 
can be determined using Ramachandran-like plots. 
Similarly to proteins and other biomolecules, the 
pairwise probabilities of $\phi_{35\text{-}15\text{-}8\text{-}18}$, $\phi_{35\text{-}12\text{-}11\text{-}22}$,
and $\phi_{35\text{-}13\text{-}14\text{-}27}$ acquiring specific values display
unmistakable cross-correlation in the MD trajectories
(Fig. 3, D to I). Importantly, the pattern of
distortion is mirrored for D- and L-Cys NPs.
These plots and MD simulations show the
mechanism of chirality transfer in Co₃O₄ NPs
that occurs during the growth of the NPs. 
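
The dihedral angles used in such Ramachandran-like plots can be computed from four atomic positions. The following is a minimal sketch of the standard geometric construction; the function name `dihedral_deg` and the coordinates are hypothetical and are not taken from the simulated NP models. The sign convention is that of this particular construction.

```python
import math


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) of four points about the p1-p2 axis."""
    def sub(u, v):
        return tuple(a - b for a, b in zip(u, v))

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    def cross(u, v):
        return (u[1] * v[2] - u[2] * v[1],
                u[2] * v[0] - u[0] * v[2],
                u[0] * v[1] - u[1] * v[0])

    b0, b1, b2 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b0, b1), cross(b1, b2)  # normals of the two planes
    x = dot(n1, n2)
    y = dot(cross(n1, n2), b1) / math.sqrt(dot(b1, b1))
    return math.degrees(math.atan2(y, x))


# Hypothetical coordinates: a coplanar arrangement and two mirror-image distortions.
planar = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
twisted = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
mirrored = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, -1, 1))
print(planar, twisted, mirrored)
```

Coplanar atoms give a dihedral angle of zero, as in the undistorted lattice, and reflecting the structure through a mirror plane reverses the sign of the angle, which is the signature expected for NPs carrying opposite enantiomers.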

Chiral distortion of the original cubic spinel
lattice changes the local magnetic fields within
Co₃O₄ NPs because the overlap between atomic
orbitals of Co³⁺ and O²⁻ and lattice symmetry are
changed (51). Illustrating this point, the spin
population of two Co(II) atoms changed from near
zero to about two (tables S8 and S11) upon geometry
optimization, owing to the distortions. The
large spin and orbital angular momenta with corresponding
operators $\hat{S}$ and $\hat{L}$ contribute to the
chiroptical activity of NPs, according to Eq. 1, via
magnetic transition dipole moments with the corresponding
operator taking the form

$$\hat{m} = \frac{-e\hbar}{2mc}\left(\hat{L} + g_s\hat{S}\right) \qquad \text{(Eq. 2)}$$

where $g_s$ is the gyromagnetic ratio for spin angular
momentum, $m$ is electron mass, $c$ is the speed of
light, and $\hbar$ is the reduced Planck constant. Unusually
high g-factors can only result from large
$|m|$ (m-allowed) and small $|\mu|$ (μ-forbidden) and
either parallel ($\theta = 0°$) or antiparallel ($\theta = 180°$)
orientations of the electric and magnetic transition
dipole moments (52). This is indeed the case for bands
at 500, 550, and 650 nm (Fig. 1B) associated with
the surface states of NPs, which provides direct
experimental evidence for a strong contribution
of magnetic transition dipole moments and paramagnetic
enhancement of the optical activity of
Co₃O₄ NPs. Naked-eye visualization of the large
g-factors characteristic of Co₃O₄ NPs for an optical
system with crossed polarizers indicates their
importance for information technologies and 
photonics (Fig. 1D). 
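
The dependence of the g-factor on the two transition dipole moments can be summarized with the standard textbook expression for the anisotropy factor. This expression is not given in the source and is included here only as a sketch, in Gaussian units, with $\theta$ denoting the angle between the electric and magnetic transition dipole moments.

```latex
\begin{equation}
g = \frac{4R}{D}
  = \frac{4\,|\mu|\,|m|\cos\theta}{|\mu|^{2} + |m|^{2}}
  \;\approx\; \frac{4\,|m|\cos\theta}{|\mu|}
  \quad \text{for } |m| \ll |\mu|
\end{equation}
```

In the usual regime where $|m| \ll |\mu|$, the g-factor increases as $|m|$ grows relative to $|\mu|$ and is largest in magnitude for $\theta = 0°$ or $180°$, in line with the conditions stated above.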

The key role of magnetically coupled un- 
paired electrons of Co atoms for the chiroptical 
activity of NPs was further confirmed by synthesizing
mixed-metal NPs that included Cu²⁺ ions.
Because the number of unpaired electrons in the
NPs with Cu²⁺ was smaller than in those with
neat Co₃O₄, magnetic transition dipole moments
should be lower according to Eq. 2. Indeed, the
g-values gradually decreased as the amount of Cu²⁺
increased (fig. S22). 

In addition to Co-Cu mixed oxides, the gen- 
erality of the magnetic effects on the chiroptical 
activity of NPs was confirmed for chiral nickel(II) 
oxide NPs that also showed strong optical ac- 
tivity with g-factors up to 0.01 (fig. S23). These 
NPs were similar in size, shape, and magnetic 
properties (figs. S24 to S26) to the Co₃O₄ NPs in
Fig. 1. 

As one would expect from Eqs. 1 and 2, the rotatory optical activity of
D- and L-Cys NPs could be altered by an external magnetic field. To
test magnetic field effects, the Co₃O₄ NPs were encapsulated in a
transparent polyacrylamide gel (Fig. 4C) to avoid variation of optical
properties owing to translational movement of the NPs. The NP gels
showed the same optical and chiroptical bands at the same wavelengths
in zero field as the NP dispersion (Fig. 4 and fig. S27). The magnetic
field effect manifested as (i) a dramatic increase in UV transparency
for circularly polarized light and (ii) its disappearance for racemic
DL-Cys NPs (fig. S28), which is fundamentally different from the
magnetic circular dichroism (MCD) that has been found for Au, Ag, or
Fe₃O₄ NPs (53-56). Importantly, the field-on/field-off ratios for
280-nm NP absorption peaks of left and right circularly polarized beams
and their sum (Fig. 4, A and B) markedly exceed the ratios for even the
giant Faraday rotation found in nanoscale plasmonic systems (19, 20,
57) and giant Zeeman splitting (58, 59). Also, chiroptical effects of
this magnitude were observed for D- and L-Cys NPs at room temperature,
as opposed to the liquid helium temperatures often used for
experimental observations of chiroptical effects in magnetic
nanomaterials (14, 15) and the magnetooptical effects mentioned above.

Fig. 2. Rotations of crystalline structures. (A and B) D-Cys (A) and
L-Cys (B) Co₃O₄ NPs. Sulfur atoms are the larger red spheres forming
one corner of the tetrahedra, and the remaining atoms depicted in red
are C-C-C-O from the Cys ligands. (C and D) Raman (C) and Raman optical
activity (ROA) (D) backscattering spectra with 532-nm excitation of
D-Cys and L-Cys Co₃O₄ NPs in scattered circular polarization ROA mode.
These spectra are courtesy of BioTools. (E) Hausdorff chirality measure
(HCM) for the NP cores. (F) Dihedral angles between atoms 18, 7, 9, and
22 [O-Co(III)-Co(III)-O] of L-Cys and D-Cys Co₃O₄ NPs.

Fig. 3. Computed atomic geometry of chiral nanoparticles. (A to C)
Graphical representation of the dihedral angles formed by four atoms:
35-12-11-22 (red), 35-17-8-18 (blue), and 35-13-14-27 (green). A
detailed description of the atomic types and numbering of each dihedral
angle may be found in fig. S21. (A) NP with ideal crystallographic
structure. (B) NP with M-D-Cys after 2000 fs of MD simulation. (C) NP
with P-L-Cys after 2000 fs of MD simulation. The direction from the S
atom to the carbonyl atom in the Cys molecules is taken as the basis
for the geometry classification according to the Cahn-Ingold-Prelog
rules. Ligands have been omitted for clarity. (D to I) Ramachandran
plots for chiral NPs: two-dimensional probability maps for the relative
orientation of adjoining octahedra pairs sharing the O atom number 35.
Average probabilities were computed along the 2000-fs MD simulation for
the NP functionalized with either M-D-Cys (left) or P-L-Cys (right).
The isovalues depicted in the plots are probabilities of 0.0002 (blue),
0.0003 (green), 0.0004 (yellow), 0.0005 (orange), and 0.001 (red).


Although these experiments contain components of light-matter
interactions from “natural” CD, the Faraday effect, and MCD (29),
magneto-optical Kerr effects are unlikely to have a substantive
contribution (39). Magnetic field-induced transparency for circularly
polarized light originates in the decrease of the absorption cross
section of the NPs when the magnetic field is applied in noncubic
crystal lattices (60), and not from plasmonic effects; the latter have
a different mechanism and smaller magnitude (53-55, 61). The chiral
distortions caused by the L- and D-Cys attachment to the surface of
Co₃O₄ NPs make the dielectric permittivity and magnetic susceptibility
tensors strongly anisotropic. The external magnetic field polarizes the
paramagnetic NPs so that the spin and orbital magnetic moments become
preferentially oriented in the direction of the field. The effect would
be small for NPs with crystal lattices of high symmetry (e.g., cubic),
because their dielectric permittivity and magnetic susceptibility
tensors are less anisotropic, and its experimental observation would
require low temperatures to mitigate thermal broadening of the
chiroptical peaks. In our observations, the transparency increase was
minimal in DL-Cys NPs, where chiral distortions were partially
compensated (Fig. 1A and fig. S28). The effect was also minimal for
optical transitions involving surface states because the near-spherical
symmetry of NPs averages the effect of external polarization on all the
terms in Eq. 1. Consequently, the visible part of the CD spectra
(>450 nm) of both D- and L-Cys NPs experiences little influence of the
magnetic field, despite the high g-factor (Fig. 1B). However, the
spectral region associated with Co(II)-Co(III) electronic transitions
displays very strong effects because the external magnetic field forces
both donor and acceptor states to become polarized simultaneously, and
that is not averaged in the individual NPs or their ensembles. A
simplified reason behind the reduced absorption cross section for the
circularly polarized light could be the alignment of the transient
magnetic moment for the Co(II)-Co(II) transition along the external
magnetic field, which makes it orthogonal to the magnetic moment of
incident photons, leading to a markedly reduced amplitude of the
$\langle \Psi_j | \hat{m} | \Psi_0 \rangle$ term.

The strong absorbance drop enables real-time 
optical modulation using a magnetic field, which 
was observed with excellent fidelity for 60 cycles 
(Fig. 4E); no agglomeration of NPs was observed. 
Cyclic switching of the gel transparency in the 
UV absorption band of NPs can also be converted 
into the modulation of photons in the visible 
range by using fluorescent targets (Fig. 4, D and
F, and scheme S2).

Structural design of chiral NPs aimed at the 
maximization of the transient magnetic moment 
contribution to the NPs’ circular dichroism re- 
sulted in a large increase of the g-factor in the 
visible range of wavelengths and intense mag- 
netic field-induced light modulation in the UV 
range. Data obtained in this study indicate that 
ceramic NPs with structural chirality and mag- 
netism can be expanded to a large family of nano- 
scale materials with tunable chiroptical, magnetic, 
and other properties, enabled by the well-known 
tolerance of metal oxides to partial metal substi- 
tution. In addition to their technological relevance 
to magneto- and opto-electronic devices, the cera- 
mic chiromagnetic NPs based on metal oxides offer 
a versatile experimental system for different fields 
of science and fundamental problems unified by 
chiral properties of nanoscale matter. 




[Figure 4 plots. Panels A and B: CD and absorbance versus wavelength (200 to 800 nm) for D-Cys and L-Cys Co₃O₄ NPs, with and without a 1.4 T field. Panel D: emission intensity versus wavelength (380 to 520 nm). Panel E: absorbance at 280 nm versus cycle number (0 to 60). Panel F: intensity at 415 nm (a.u.) versus cycle number (0 to 12).]


Fig. 4. Optical modulation. (A and B) CD and MCD (A) and corresponding
absorbance spectra (B) of L-Cys and D-Cys Co₃O₄ NPs. (C) Photograph of
the optically transparent gel made from L-Cys Co₃O₄ NPs. (D) Emission
intensities of fluorescent paper plus the NP gel in front, with a
magnetic field applied to the NP gel (red and blue) and without a
magnetic field (green) (excitation, 280 nm). (E) Cycling performance of
the NP gel’s absorbance at 280 nm with and without magnetic fields.
(F) Cycling profile of emission intensity at 415 nm with and without
magnetic fields and corresponding photographs of blue-emitting light
from the fluorescent paper.

CHEMICAL ENGINEERING


Digitization of multistep organic 
synthesis in reactionware for 
on-demand pharmaceuticals 


Philip J. Kitson, Guillaume Marie, Jean-Patrick Francoia, Sergey S. Zalesskiy, 
Ralph C. Sigerson, Jennifer S. Mathieson, Leroy Cronin* 


Chemical manufacturing is often done at large facilities that require a sizable capital
investment and then produce key compounds for a finite period. We present an approach
to the manufacturing of fine chemicals and pharmaceuticals in a self-contained plastic
reactionware device. The device was designed and constructed by using a chemical
to computer-automated design (ChemCAD) approach that enables the translation of
traditional bench-scale synthesis into a platform-independent digital code. This in turn
guides production of a three-dimensional printed device that encloses the entire
synthetic route internally via simple operations. We demonstrate the approach for the
γ-aminobutyric acid receptor agonist, (±)-baclofen, establishing a concept that paves
the way for the local manufacture of drugs outside of specialist facilities.


The manufacture of active pharmaceutical ingredients (APIs) is vital
for modern health care, yet critical drugs are regularly manufactured
for a finite period in a limited number of sites. The manufacture of chemical
products—whether bulk, fine, or specialty chem- 
icals, such as APIs—is currently based on a model 
whereby a central plant is exclusively designed 
for the manufacture of the product, or range of 
products, sold by that particular company (1). This 
model holds whether the manufacturer is a large 
pharmaceutical company or, as is increasingly the 
case, a contract research organization operating 
large chemical manufacturing plants to order 
from the pharmaceutical industry. This process 
leads to safety issues around both the storage and 
transport of such materials, as well as the issues 
inherent in the large-scale manufacture of chem- 
icals (2). In addition, these large-scale plants are 
often at the mercy of complicated and global sup- 
ply chains of raw materials, the failure of which 
at any point will reduce or halt the capacity of the 
plant to produce materials and deliver them effec- 
tively (3, 4). Also, when a given complex inter- 
mediate or API goes out of production, the plants 
are often repurposed and the manufacturing ca- 
pacity is lost. The reinstatement of the process 
would require, in the best case, substantial capi- 
tal investment to reconfigure a chemical plant for 
its synthesis. To alleviate this issue, we propose a 
concept whereby the large-scale manufacturing 
process of complex fine chemicals, such as APIs, 
is augmented by distributed, point-of-use manu- 
facturing in self-contained cartridges, requiring 
limited user interaction to produce the desired 
products on demand. To achieve this, we devel- 
oped a methodology for the translation of bench- 
scale synthesis procedures into a step-by-step 
workflow that could be used to create digital designs for custom
reactionware that can be fabricated by using three-dimensional (3D)
printing technologies. In this way, we aim to move beyond the preserve
of industrial manufacturing and prototyping applications (5), to
revolutionize the relationship between the design, manufacture, and
operation of functional devices (6-11) and exploit the increasing use
of 3D printing in the automation of the chemical sciences (12-15). This
methodology, which is in stark contrast to both large- and medium-scale
traditional chemical manufacture, and also to the use of
continuous-flow and microreactor approaches (1, 16, 17),
allows for the distribution of simple chemical 
precursors and solvents rather than the complex 
products themselves. These precursors could 
then continue to benefit from the economies of 
scale brought by traditional manufacturing pro- 
cesses while complex products with short shelf 
lives, or lower and more distributed demand, can 
be produced locally. This has added benefits in 
terms of manufacture of the final products as 
the synthesis of smaller quantities is inherently 
safer than large-scale processes and poses less 
risk to both operators and infrastructure. Fur- 
ther, the translation of these synthetic approaches 
into a digitally defined format, where the reactor 
design and, eventually, an automated synthesis 
procedure are encoded, could allow the digitiza- 
tion of all chemical products into a very low-cost 
manufacturing format. This could allow large 
numbers of discontinued APIs to be made avail- 
able as they can be brought back into production 
on a small scale by the fabrication and use of the
appropriate cartridges (18, 19). 

As a proof of principle, we present a process by 
which the traditional laboratory-scale synthesis 
of a commercially available API can be translated 
into the design of an integrated cartridge. To do 
this, all the reaction steps and intrasynthesis purification processes
are encoded into the 3D architecture of the cartridge so that the
chemical reactions, work-ups, and purification are carried out
automatically, with minimal user intervention and exposure. We have
demonstrated this process in the full synthesis of the antispastic
medication (±)-baclofen (see below).

This method for translating traditional lab- 
oratory syntheses into a form that can be en- 
capsulated in a single cartridge is split into three 
layers of consideration, which were iteratively 
reevaluated during the cartridge development 
process. The first is the “conceptual layer,” where 
the chemical reactions and processes necessary 
are identified and developed. The second is the 
“digital layer,” in which these processes are trans- 
lated into digital 3D models of the final cartridge 
devices. Finally, a “physical layer,” where the dig- 
ital models are realized as either a modular 
implementation or a monolithic implementation, 
is used to generate the finalized cartridge, which 
can be used to effect the designed synthesis (Fig. 1). 
These physical systems can then be tested for ef- 
ficacy as a final implementation, before iterating 
the process to develop reliable cartridge syntheses. 

First, the fundamental chemistry required for 
the transformations is considered and optimized 
to minimize the necessary interstep purification 
for the completion of the full synthesis. This ap- 
proach is similar to that taken to develop tel- 
escoped (i.e., consecutive transformations in a 
single reactor or sequence of reactors without 
isolation and purification of intermediates) and 
“one-pot” syntheses (20, 21), often used in process
chemistry, both of which aim to maximize
the efficiency of the synthetic route. Although here 
it is not necessary to produce genuinely telescoped 
syntheses, as modules for interstep purification 
can be built into the cartridge design, the syn- 
thesis of the desired compound, including all 
reagents and starting materials for all the neces- 
sary steps, must be considered as a unified pro- 
cess. The choice of synthetic route to any target 
compound will be dictated by a number of fac- 
tors, including the relative availability and cost 
of starting materials, reagents, and solvents, as 
well as the compatibility of reaction and purifica- 
tion sequences with the reactor modules produced. 
In any wide-scale application of our approach, a 
cost analysis of any proposed synthetic route will 
have to be performed to ensure its viability for 
the product. Once the chemistry for the synthesis 
is developed, a sequence can be produced where 
the physical processes and reaction parameters— 
such as heating, cooling, phase separations, reac- 
tion volumes, and times—can be identified. 
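
As a rough illustration of how an identified process sequence could be turned into reactor parameters, the sketch below records each process with its temperature, duration, and liquid volume, then derives a chamber volume and the height of a cylindrical chamber. The class, function names, headspace factor, and every numerical value are hypothetical placeholders, not data or code from this study.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessStep:
    name: str
    temperature_c: float  # bath temperature during the step
    duration_h: float     # time the step occupies the chamber
    volume_ml: float      # liquid volume present during the step


def reactor_volume_ml(steps, headspace_factor=1.5):
    """Chamber volume needed to hold the largest step, plus headspace."""
    return headspace_factor * max(step.volume_ml for step in steps)


def reactor_height_mm(volume_ml, inner_diameter_mm):
    """Height of a cylindrical chamber of the given volume (1 ml = 1000 mm^3)."""
    base_area_mm2 = math.pi * (inner_diameter_mm / 2.0) ** 2
    return volume_ml * 1000.0 / base_area_mm2


# Placeholder steps for one telescoped module; none of these numbers
# come from the paper.
module_steps = [
    ProcessStep("reaction", temperature_c=25.0, duration_h=2.0, volume_ml=3.0),
    ProcessStep("solvent evaporation", temperature_c=40.0, duration_h=1.0, volume_ml=3.0),
    ProcessStep("extraction", temperature_c=25.0, duration_h=0.5, volume_ml=6.0),
]

volume = reactor_volume_ml(module_steps)
height = reactor_height_mm(volume, inner_diameter_mm=20.0)
total_time = sum(step.duration_h for step in module_steps)
print(f"reactor volume {volume:.1f} ml, height {height:.1f} mm, time {total_time:.1f} h")
```

Sizing the chamber from the largest step in the module reflects the idea that several processes share one chamber; a real design would also have to account for stirring, frits, and port positions.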

Vital to the success of these modules is the 
compatibility of the cartridge material with the 
chemistry being performed. Whereas tradition- 
al laboratory syntheses take place mostly in 
glassware, we use polypropylene (PP) as a basic 
structural material for the fabrication of the 
cartridges. We have found that this polyolefinic 
material, while demonstrating a robust range of 
chemical compatibility for traditional synthetic 
organic reactions, is also a suitable substrate for 
3D printing applications (22-24). This gives the 
best balance of chemical resistance and material 
properties for 3D printing. Therefore, the first
step in the design process is testing the reactions 
necessary for compatibility in the reactor mate- 
rials. Future iterations of the concept could ex- 
pand on the materials and fabrication processes 
available for the reaction modules to further de- 
velop the range of chemistries feasible in this sys- 
tem, using, for example, perfluorinated polymers 
to increase the chemical resistance of the module. 

To demonstrate the feasibility of incorporating 
these PP reactors into the production of APIs, 
we tested a number of reactions that lead to 
such targets in test modules fabricated from PP 
(Fig. 2). We tested reactions for the synthesis of 
three APIs: the central nervous system inhibitor 
(±)-baclofen (25), the anticonvulsant lamotrigine
(26), and the gastroprotective agent zolimidine 
(27). As can be seen, all of the reactions tested 
were observed to work, but with slightly lower 
efficiency in PP reactors than in traditional glass
reactors, owing to physical loss of material on the
relatively rough PP surface hampering product 
recovery. Surface roughness is inherent in the 3D printing process;
however, use of other, as yet undeveloped, materials or different
manufacturing techniques could reduce this issue. The zolimidine
reactions, particularly the copper-catalyzed iodination reaction,
experienced a pronounced reduction in efficiency compared to those for
(±)-baclofen or lamotrigine. We surmised that this was due to
side reactions of the iodine with the polypropylene.
These issues highlight that the process of trans- 
lation from glassware must take into account both 
the physical and chemical properties and limita- 
tions of the reactor substrate (23). For this reason, 
the first two syntheses were selected for further 
development into reaction cartridges to explore 
the concept. 

Once the processes needed for the reaction sequence are identified,
the combined continuous
protocol is mapped onto the 3D digital designs
for the target-specific cartridge. The sequence of 
processes is split into a series of modules, with 
each representing a telescoped series of processes 
that can take place in a single chamber of the 3D 
printed system. Each process module is then 
created as a digital model that can be manipu- 
lated to provide the correct physical dimensions 
necessary for the reaction scheme. The 3D mod- 
els of the cartridges used in this study were 
created with OpenSCAD software, an open-source 
framework for CSG (constructive solid geometry) 
modeling that allows a highly flexible and con- 
figurable approach to create versatile libraries 
of components as reusable pieces of code. Once 
defined, these pieces of code can be manipulated 
by the software, allowing the generation of com- 
plex reactor geometries with minimal human in- 
puts. For example, in this study, we designed a 
module library consisting of interchangeable top 

Fig. 1. Schematic representation of the translation of a multistep
synthesis from conception through to implementation as a reaction
cartridge. Reactions necessary for the synthesis are identified
(A + B → C → D, top left panel) and the specific chemical and physical
processes and reaction parameters necessary for each reaction are laid
out (conditions i to iii, left panel). These processes are then
translated into bespoke reaction modules designed to accomplish one or
more of the chemical processes identified in the previous step (top
right panel). The modules are then designed as 3D CAD models (lower
center panel), with libraries of module components to accommodate the
required reaction parameters. These digital models can then be
fabricated to produce either a modular or monolithic implementation
(lower right panel) of the process.
[Figure 1 panel labels: Parametric Module Design; Solvent Evaporation; reaction time (h); reaction volume (ml); reactor height (mm); reactor volume (ml).]

Fig. 2. Comparison of glass reactors with plastic reactionware for the optimized synthetic routes to (±)-baclofen (top), lamotrigine (middle),
and zolimidine (bottom) with reaction yields for each step (reaction yields in PP vessels given in parentheses). Single (top right) or double
(bottom right) chambered polypropylene reaction test cartridges were used. PP, polypropylene; TBAF, tetrabutylammonium fluoride; THF, tetrahydrofuran.


and bottom components with varying features 
that can be easily combined to produce reaction 
vessels with different shapes and features. From 
a single line of code, an entire module can be 
created, with 18 different shapes available (i.e., 
three different tops and six different bottoms can 
be selected; Fig. 3). The modules were designed 
around simple chambers where each reaction or 
process could be performed in as close a manner 
as possible to the way it would be carried out with 
traditional batch chemical techniques, easing the 
transition between published synthesis in glass- 
ware and “cartridge” synthesis. Typically, a stan- 
dard module would have an opening on the top 
of the wall of the chamber for transfer of reaction 
mixtures from previous modules and an opening 
at the bottom of the chamber for expelling mate- 
rial from the module subsequent to the completion 
of the desired process. The transfer of material 
between modules is facilitated by a further open- 
ing in the roof of the compartment, which can be 
used to apply pressure that forces the reaction 
medium out of the chamber via the outlet at the 
bottom. The opening at the top otherwise equal- 
izes pressure throughout the device to prevent the 
premature transfer of material, and also allows 
for application of vacuum to remove and exchange 
solvents. These modules can then be combined 
in sequences by use of further components of 
our module library such as siphon tubes for the 
transfer of material from one reaction module 
to another. 
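
The count of 18 shapes follows from combining every top with every bottom (3 × 6). A minimal sketch of that enumeration is given below. The top names are hypothetical, and the split of the six bottoms into two shapes (round, flat) and three outlet types (none, internal, external) is an assumption read from the Fig. 3 panel labels, not something stated in the text.

```python
from itertools import product

# Hypothetical component names; the study's own library may label them differently.
TOPS = ("top_a", "top_b", "top_c")
BOTTOM_SHAPES = ("round", "flat")
OUTLETS = ("no_outlet", "internal_outlet", "external_outlet")


def bottom_variants():
    """All bottoms: each shape combined with each outlet type (2 x 3 = 6)."""
    return [f"{shape}_{outlet}" for shape, outlet in product(BOTTOM_SHAPES, OUTLETS)]


def module_variants():
    """All (top, bottom) pairs that a single module call could select."""
    return list(product(TOPS, bottom_variants()))


variants = module_variants()
print(f"{len(bottom_variants())} bottoms x {len(TOPS)} tops = {len(variants)} modules")
for top, bottom in variants[:3]:
    print(top, "+", bottom)
```

Keeping tops and bottoms as independent lists means that adding one new bottom design extends the library by as many modules as there are tops, which is the practical benefit of the interchangeable-component approach.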

Once a reaction chamber is created, new features can be introduced by
subtracting or adding shapes to the module. For example, a filtration
device can be made from a module with a top input, a round bottom with
a port, and a glass filter. To achieve this feature, a cylindrical
model conforming to the dimensions of the physical filter to be
inserted is created and subsequently subtracted from the model of a
reaction chamber, producing a void space in the model into which the
filter fits (see supplementary materials). Phase separation
modules were achieved in a similar manner by 
using hydrophobic frit inserts that effectively 
separate organic and aqueous phases for product 
extractions. In keeping with our desire to design 
synthesis cartridges that can be produced outside 
traditional manufacturing regimes, we have ex- 
ploited our group’s development of 3D printed 
reactors—reactionware—for synthetic chemical 
applications as a method of prototyping the phys- 
ical reactors (28, 29). Three-dimensional printing- 
based fabrication approaches have the added 
advantage of being intimately linked to the de- 
sign process. 
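
The subtraction step described above can be sketched as a small Python helper that writes OpenSCAD source, using OpenSCAD's built-in `difference()`, `translate()`, and `cylinder()` operations. This is only an illustration under stated assumptions: the function names, the simple cylindrical chamber, and all dimensions are hypothetical, and it is not the authors' module library.

```python
import math


def filter_module_scad(outer_d: float, height: float, wall: float,
                       filter_d: float, filter_h: float, filter_z: float) -> str:
    """Return OpenSCAD source for a chamber with a void that seats a filter disc.

    All dimensions are in millimetres and are illustration values only.
    """
    inner_d = outer_d - 2 * wall
    if not inner_d < filter_d < outer_d:
        raise ValueError("filter must be wider than the cavity and narrower than the body")
    if not wall <= filter_z <= height - filter_h:
        raise ValueError("filter void must sit inside the chamber height")
    return "\n".join([
        "difference() {",
        f"    cylinder(h = {height}, d = {outer_d});  // solid chamber body",
        f"    translate([0, 0, {wall}]) cylinder(h = {height}, d = {inner_d});  // open-top cavity",
        f"    translate([0, 0, {filter_z}]) cylinder(h = {filter_h}, d = {filter_d});  // filter void",
        "}",
    ])


def void_volume_ml(filter_d: float, filter_h: float) -> float:
    """Volume of the cylindrical filter void in millilitres (1 ml = 1000 mm^3)."""
    return math.pi * (filter_d / 2) ** 2 * filter_h / 1000.0


print(filter_module_scad(outer_d=24.0, height=40.0, wall=2.0,
                         filter_d=22.0, filter_h=3.0, filter_z=10.0))
```

Because the filter void is slightly wider than the cavity, it cuts a ledge into the wall on which the physical filter can rest; the printed output can be pasted into OpenSCAD to inspect the geometry.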

Fabrication of the modular system was carried 
out on low-cost (~$2000) 3D printers, Ultimaker 
2 and 2+, although many other fused deposi- 
tion modeling (FDM) printers could print the 3D 
modules produced through this approach. If it 
is necessary to incorporate nonprinted materials 
during 3D printing of the final module, a preprogrammed pause in the
printing process is instigated at a point just above the designed
void, and the component is inserted in this space before the
resumption of printing. Upon completion of printing, the inlet and
outlet ports were tapped with a ¼ inch unified national fine (UNF)
thread to allow ease of integration with the external
infrastructure for performing the reaction
sequences. Using standard ports allowed us to 
attach either standard fluidic tubing connectors 
such as those found in traditional flow synthesis 
setups, or widely used Luer lock adapters. These 
Luer lock connectors are easily reconfigurable, 
facilitating feedback into the design process. 

The API chosen to accomplish a complete end-to-end synthesis was the
central nervous system depressant and antispastic medication
(±)-baclofen (30, 31) [(RS)-β-(4-chlorophenyl)-γ-aminobutyric acid] (4)
(Fig. 4), a derivative of γ-aminobutyric acid (GABA) that modulates the
action of this central inhibitory neurotransmitter (25). This target
was chosen as an example to demonstrate that even relatively short
syntheses require a disproportionately larger set of chemical
processing steps to effect the full synthesis; in the future, we
envision that the synthesis of larger numbers of compounds and compound
classes will greatly expand the scope of this approach. (±)-Baclofen
has found a number of applications since its first reported synthesis
and is currently being investigated beyond its traditional use, as a
high-dose treatment for alcoholism (32). Many syntheses of (±)-baclofen
have been published since it was first reported, often proceeding
through the formation, and subsequent hydrolysis, of
β-(4-chlorophenyl)-γ-butyrolactam (3). We have modified such a
traditional synthesis of (±)-baclofen starting from the commercially
available material methyl 4-chlorocinnamate (1), and proceeding via the
Michael addition of nitromethane to form
4-nitro-3-(4-chlorophenyl)butanoic acid (2), followed by
nickel-catalyzed reductive lactamization and subsequent acid hydrolysis
to produce the final product in its commercially available racemic form
as a hydrochloride salt. This three-reaction-step sequence contains 12
individual processing steps that must be incorporated into the
reactionware device to complete the synthesis (Fig. 4). This sequence
was designed to be particularly amenable to translation into the
modular or monolithic system because, at each stage, the reactions are
either sufficiently clean, or reaction impurities that would impinge on
subsequent processes in the synthesis could be readily removed by phase
partition. The
final product is purified through a methanol-diethyl ether
crystallization, which yields a crystalline solid that can be retrieved
directly from the cartridge device. An animation of the entire process,
showing the passage of reagents, processes, and work-ups, is shown in
movie S1.

Each of these processes was translated into operations that could be
successfully embodied in one or more reaction or purification modules.
The specific reaction modules used for the synthesis of (±)-baclofen
were (a) a combined Michael addition, evaporation and ether extraction
module; (b) a combined solvent exchange and reduction module; (c) a
phase separation and filtration module; (d) a combined solvent exchange
and hydrolysis module; and (e) a filtration module. Individual modules
were fabricated for a “plug-and-play” approach to the reaction process
development by using Luer lock fittings to connect individual modules
and Luer taper-compatible valves to interface with pressure or vacuum
systems. This design allowed testing of each individual process in
isolation before the modules were combined to build up the full
synthesis. Finally, the module designs were “digitally stitched
together” by using the developed CAD libraries for internal fluidic
pathways to create the design for a monolithic synthesis cartridge.
Once fabricated, the individual modules and the monolithic cartridges
were evacuated and filled with a nitrogen atmosphere to ensure an inert
environment for the subsequent chemistry.

The first chamber, (a), consists of a lower volume (4.9 ml) where the
initial reaction can take place and is separated from the upper outlet
by a hydrophobic frit. Reactor modules (b) and (d) consist of a single
unbroken reaction chamber (31.8 ml) with sufficient volume to
accommodate the reaction volumes and extraction solvents from the
previous processes before concentration under reduced pressure.
Extraction module (c) consists of a chamber of sufficient volume
(4.7 ml) to contain the aqueous phase from the previous chamber, which
has a drain at the bottom covered by a hydrophobic frit that prevents
both solid material and aqueous solution from passing into the next
chamber or module. The final module is a filtration module for
separating and retrieving the final product. This single module can be
either open to the atmosphere or enclosed as required. During the
fabrication process, chambers or modules that required stirring were
equipped with a PTFE (polytetrafluoroethylene)-coated magnetic stirring
bead (length 10 mm) to enable mixing of the contents. Each module or
chamber of the monolith was equipped with a ¼ inch UNF threaded port
carrying a female Luer lock adapter, which was used to introduce an
inert (dry, N₂) atmosphere, or reduced pressure, into the system. The
modular system was designed such that there was a single fluidic path
through the reactor; flow from one chamber into the next was induced
either by pressure from excess solvent, in the case of the phase
separation processes, or the introduction of nitrogen pressure
difference between the relevant chambers to push the reaction mixture
through an embedded channel running from the bottom of one chamber to
the top of the next.

Starting materials were prepared as simple solutions and transferred to
the cartridge via standard Luer syringes. The cooling and heating
required for the reaction sequence were achieved by the immersion of
the reaction cartridge or module in an ice or sand bath, respectively,
and the temperature required for the reactions can be achieved
automatically on a stirring-hotplate. The exact sequence of operations,
positioning of the module in the heating or cooling bath, and time
intervals necessary for completing the synthesis are outlined in the
supplementary materials (figs. S12 and S13 and table S3).

Performing the synthesis starting from 200 mg of 1 in the manner
described yielded 98 mg (39% yield over three reaction steps and 12
processing steps from 1 with ≥95% purity as determined by
high-performance liquid chromatography) of (±)-baclofen hydrochloride
salt, which is more than 1 day’s maximum dosage of the drug. Better
efficiency of reaction can be achieved with lower concentrations of
starting materials (using a similar cartridge at half concentration,
i.e., 100-mg scale, gave a 44% yield over three steps of similar
purity). Increasing the volume of the reactor as well increases the
quantity of (±)-baclofen obtained [a 300-mg scale synthesis yielded
133 mg (35%) (±)-baclofen].

The integration of the reaction processing steps into the design of the
modules greatly simplifies the operations required to perform the
reaction sequence compared to traditional bench synthesis and
simultaneously reduces the level of technical skills required to
perform the process down to simple operations that do not require the
specific skills of a trained synthetic chemist. Although the total time
for the reaction sequence is around 40 hours in this case, including
all intermediate operations, the workflow is constrained by the
geometry of the device, so all human interaction is limited to simple
interventions at specific time periods, and it should be possible to
shorten the interaction time further. The use of such bespoke,
single-use cartridges

[Figure 3 panel labels: Top; Round Bottom; Flat; No Outlet, Internal Outlet, External Outlet; Large Reaction Module; Ether Extraction Module; DCM Extraction Module.]
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Fig. 3. Parameterized approach to the design of individual process 
modules. Digital libraries of module components (top) can be easily 

assembled to produce a wide range of module geometries dictated by 
the specific process and reaction parameters (e.g., solvent volumes, 

number of inputs and outputs, etc.) (bottom). Hydrophobic filters for 
phase separation are shown in red, and fritted glass filters are shown in 
blue. DCM, dichloromethane. H®, reactor height. 


19 January 2018 


would greatly reduce the time spent on 
glassware preparation, liquid handling, 
and other ancillary tasks associated 
with the majority of chemical synthe- 
ses at this scale. Also, by using the 
geometry of the reactor to constrain 
the operation of the synthesis, we 
reduce the human decision making 
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involved in the synthesis processes, making 
the sequence more reproducible. Given sufficient 
facilities, several instances of the synthesis car- 
tridge could be used at once, achieving scalability 
by numbering-up arrays of cartridges, and using 
these in parallel to increase the output. As a re- 
sult of the ability to parameterize and encode 
multistep organic synthesis reactions with work- 


ups embedded, we envisage that a digital pro- 
grammable universal heater-stirrer-solvent-reagent 
plug-and-play device can be constructed into which 
only the cartridge, specific to a given synthesis, 
can be plugged in. 

Fig. 4. Synthesis of (±)-baclofen in a series of reaction cartridges. (Top)
Conceptual synthetic procedure for the synthesis of (±)-baclofen under the
conditions described in Fig. 2, showing the necessary processing sequence to
effect this synthetic pathway. These processes were then split into modules
(a) to (e) (indicated by gray boxes in the process sequences), which we
translated into a digital design (middle left) and finally fabricated as
either a modular (middle right) or monolithic (bottom left) implementation.
A partially fabricated monolithic cartridge is also shown indicating the
placement of non-3D-printed components and internal fluidic pathways (bottom
center and right). Both modular and monolithic cartridges are shown with Luer
taper-compatible valving for interfacing with external fluidic inputs and
pressure or vacuum lines.
[Fig. 4 labels: reaction temperature (°C); reaction time (h); reaction volume
(ml); external port.]

The (±)-baclofen synthesis necessitated liquid handling and separation of
reaction chambers to effect the full reaction sequence. In some cases,
however, syntheses can be conducted in single reaction cartridges, depending
on the nature and quality of the interstep purification required. For
example, the synthesis of lamotrigine (Fig. 2) can be achieved in a single
cartridge as the intermediate material is insoluble in the reaction solvent
at low temperatures. In a single, closed, filtration module, the initial
reaction product could be washed and processed in situ before introduction of
the solvent for the subsequent cyclization step. This stands in contrast to
the traditional procedure, which requires the solid product of the first step
to be removed from the initial reactor to be filtered, dried, and then
reintroduced to a reactor for the second step of the synthesis. Performing
the synthesis of lamotrigine on a 250-mg scale of starting material yields
112 mg (46% over two reaction steps) of the final product as an off-white
crystalline powder.
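
The overall yields quoted for the cartridge syntheses span different numbers of reaction steps, so they are easier to compare as an average per-step yield. The following is a minimal sketch, not part of the reported work: it assumes that every step contributes equally (a geometric mean), which the text does not claim, and the function name `average_step_yield` is hypothetical. The input fractions are the overall yields stated in the text.

```python
def average_step_yield(overall_yield: float, n_steps: int) -> float:
    """Geometric-mean yield per step implied by an overall yield.

    Assumes all steps are equally efficient, which is an illustrative
    simplification rather than a measured property of the syntheses.
    """
    if not 0.0 < overall_yield <= 1.0:
        raise ValueError("overall_yield must be a fraction in (0, 1]")
    if n_steps < 1:
        raise ValueError("n_steps must be at least 1")
    return overall_yield ** (1.0 / n_steps)


# Overall yields quoted in the text: (label, fraction, reaction steps).
REPORTED = [
    ("baclofen, 200-mg scale", 0.39, 3),
    ("baclofen, 100-mg scale", 0.44, 3),
    ("baclofen, 300-mg scale", 0.35, 3),
    ("lamotrigine, 250-mg scale", 0.46, 2),
]

if __name__ == "__main__":
    for label, overall, steps in REPORTED:
        per_step = average_step_yield(overall, steps)
        print(f"{label}: {overall:.0%} overall, about {per_step:.0%} per step")
```

Under this assumption the three-step baclofen runs and the two-step lamotrigine run can be placed on a common scale, although the real per-step yields may differ considerably from the average.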

The digital approach to the design of the sys- 
tem that we have adopted allows the blueprints 
for these cartridges to be stored electronically for 
implementation as and when required. The dis- 
tribution model for fine and specialty chemicals, 
such as the APIs implied by this approach, would 
lead to a decentralizing of logistical approaches 
to chemical manufacture. Here, any location with 
access to a sufficiently diverse market of chemical 
precursors and suitable cartridge fabrication facil- 
ities could be used to produce chemical products, 
which could previously be achieved only in a fully 
equipped synthesis laboratory with highly trained 
staff. This approach not only holds promise for 
eventually delivering on-demand personalized 
medicines manufactured at, or near, the point of 
use, but also has short-term potential applications 
in the synthesis of APIs that are currently out of 
production. An immediate impact of digitization 
is that the cost for synthesis at the bench scale 
(milligrams) could decrease markedly owing to 
savings in labor and infrastructure with only a 
one-off digitization cost (and allow operators to 
make 5 to 10 different products at the same time). 
Once the saving meets the digitization cost, the 
efforts of the expert chemist will shift from be- 
spoke on-demand chemical manufacturing to 
chemical digitization (see supplementary mate- 
rials for an economic analysis). Our methodology 
will have the most rapid impact for chemicals 
that are currently produced on demand in small 
batches and that occupy a gap in the market 
where the demand for a product is sufficient for 
it to be commercially viable but insufficient to 
justify plant-scale production. This gap lies be- 
tween the high cost of bench-scale versus reactor- 
scale synthesis, and thus the digitization benefit 
of compounds in this zone is high. 
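
The break-even argument above (a one-off digitization cost recovered through per-synthesis savings) can be sketched as simple arithmetic. This is an illustrative sketch only: the function name `breakeven_runs` and all monetary figures are hypothetical placeholders, and the economic analysis in the supplementary materials is not reproduced here.

```python
import math


def breakeven_runs(digitization_cost: float,
                   manual_cost_per_run: float,
                   cartridge_cost_per_run: float) -> int:
    """Number of cartridge runs after which cumulative savings cover the
    one-off digitization cost (all inputs in the same currency unit)."""
    saving_per_run = manual_cost_per_run - cartridge_cost_per_run
    if saving_per_run <= 0:
        raise ValueError("cartridge runs must be cheaper than manual runs")
    return math.ceil(digitization_cost / saving_per_run)


if __name__ == "__main__":
    # Hypothetical figures chosen only to illustrate the calculation.
    runs = breakeven_runs(digitization_cost=10_000,
                          manual_cost_per_run=500,
                          cartridge_cost_per_run=100)
    print(f"Savings meet the digitization cost after {runs} runs")
```

With these placeholder numbers the saving is 400 per run, so the digitization cost is recovered after 25 runs; products made more often than that fall in the zone where digitization pays off.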

The regulatory framework necessary to pro- 
duce complex materials in this fashion will need 
thorough attention; indeed, our approach would 
require a completely new system for the regulation
of API manufacture. This system would
have to be developed alongside the evolution of 
this approach as a method for pharmaceutical 
synthesis, which we have presented here in proof- 
of-concept form; however, we can envision a situa- 
tion in which regulatory agencies certify specific 
cartridge or module designs as soon as a digitized 
process is fully established (including the embed- 
ded quality-control protocols), independent of the 
physical location of the person who uses the cartridge.
This approach has multiple benefits. First, the 
framework can adopt well-established methods 
of digital object certification from the informa- 
tion technology universe (e.g., digital signing with 
asymmetric ciphers). Second, no explicit certifi- 
cation would be needed for each new “facility” 
(which might be a hospital or a private house) 
that would need the drug. Third, existing methods 
for protecting and manipulating digital content 
provide much more efficient models for distribu- 
tion and regulation compared to the retail and 
patent system, respectively. These regulatory is- 
sues surrounding the commercial or clinical appli- 
cation of this approach are not trivial, and care 
must be taken to ensure that end-user safety is 
not compromised. However, we believe that the 
benefits in terms of efficiency of delivery, robust- 
ness of supply, and range of materials available 
could lead to the digitization of chemical synthesis. 


REFERENCES AND NOTES 


1. D. M. Roberge, L. Ducry, N. Bieler, P. Cretton, B. Zimmermann, Chem. Eng. Technol. 28, 318-323 (2005).
2. D. A. Crowl, J. F. Louvar, Chemical Process Safety: Fundamentals with Applications (Pearson Education, 2001).
3. M. A. Ehlen, A. C. Sun, M. A. Pepple, E. D. Eidson, B. S. Jones, Comput. Chem. Eng. 60, 102-111 (2014).
4. G. E. Applequist, J. F. Pekny, G. V. Reklaitis, Comput. Chem. Eng. 24, 2211-2222 (2000).
5. D. Dimitrov, K. Schreve, N. de Beer, Rapid Prototyping J. 12, 136-147 (2006).
6. F. Rengier et al., Int. J. Comput. Assist. Radiol. Surg. 5, 335-341 (2010).
7. X. M. Li et al., Int. J. Polym. Sci. 2014, 829145 (2014).
8. S. J. Hollister, Nat. Mater. 4, 518-524 (2005).
9. S. Hong et al., Adv. Mater. 27, 4035-4040 (2015).
10. M. Bogers, R. Hadar, A. Bilberg, Technol. Forecast. Soc. Change 102, 225-239 (2016).
11. R. Bogue, Assem. Autom. 33, 307-311 (2013).
12. S. V. Ley, D. E. Fitzpatrick, R. J. Ingham, R. M. Myers, Angew. Chem. Int. Ed. 54, 3449-3464 (2015).
13. B. C. Gross, J. L. Erkal, S. Y. Lockwood, C. Chen, D. M. Spence, Anal. Chem. 86, 3240-3253 (2014).
14. B. Gross, S. Y. Lockwood, D. M. Spence, Anal. Chem. 89, 57-70 (2017).
15. O. Okafor et al., React. Chem. Eng. 2, 129-136 (2017).
16. N. Kockmann, M. Gottsponer, B. Zimmermann, D. M. Roberge, Chemistry 14, 7470-7477 (2008).
17. A. Adamo et al., Science 352, 61-67 (2016).
18. J. M. Pearce, in Open-Source Lab (Elsevier, Boston, 2014), pp. 1-11.
19. M. Coakley, D. E. Hurt 3rd, J. Lab. Autom. 21, 489-495 (2016).
20. C. Vaxelaire, P. Winter, M. Christmann, Angew. Chem. Int. Ed. 50, 3605-3607 (2011).
21. Y. Hayashi, Chem. Sci. 7, 866-880 (2016).
22. P. J. Kitson, S. Glatzel, L. Cronin, Beilstein J. Org. Chem. 12, 2776-2783 (2016).
23. P. J. Kitson, R. J. Marshall, D. Long, R. S. Forgan, L. Cronin, Angew. Chem. Int. Ed. 53, 12723-12728 (2014).
24. P. J. Kitson, M. D. Symes, V. Dragone, L. Cronin, Chem. Sci. (Camb.) 4, 3099-3103 (2013).
25. M. Da Prada, H. H. Keller, Life Sci. 19, 1253-1263 (1976).
26. A. Fitton, K. L. Goa, Drugs 50, 691-713 (1995).
27. L. Almirante et al., J. Med. Chem. 8, 305-312 (1965).
28. M. D. Symes et al., Nat. Chem. 4, 349-354 (2012).
29. P. J. Kitson et al., Nat. Protoc. 11, 920-936 (2016).
30. P. Camps, D. Muñoz-Torrero, L. Sanchez, Tetrahedron Asymmetry 15, 2039-2044 (2004).
31. A. Mann et al., J. Med. Chem. 34, 1307-1313 (1991).
32. G. Addolorato et al., Alcohol Alcohol. 37, 504-508 (2002).


ACKNOWLEDGMENTS 


We acknowledge the help of S. Marshall in compiling our analysis 
of the economic impact of our methodology. Supplementary 
materials include a PDF document detailing the materials and 
methods used in this article; the STL and OpenSCAD files used 
to generate all 3D printed objects mentioned; a schematic movie 
illustrating the process of baclofen synthesis in the monolithic 
cartridge; and Python code that can be used to automate the 
stirrer-hotplate operations for the baclofen synthesis. We gratefully 
acknowledge financial support from the Engineering and Physical 
Sciences Research Council (grant nos. EP/H024107/1, EP/
J015156/1, EP/K021966/1, EP/L015668/1, EP/L023652/1) and
European Research Council (project 670467 SMART-POM). This 
research was developed with funding from the Defense Advanced 
Research Projects Agency (DARPA). The views, opinions and/or 
findings expressed are those of the author and should not be 
interpreted as representing the official views or policies of the 
Department of Defense or the U.S. Government. L.C. is the founder 
and director of CroninGroupPLC and is listed as an inventor on a 
patent application filed by The University of Glasgow (GB 1800299.8). 
L.C. conceived the initial concept and the design approach; P.J.K. 
designed the reactionware with help from J.P.F. and S.Z.; G.M. and 
R.C.S. telescoped the methods, porting them from glass to plastic; 
and P.J.K. developed the monolithic cartridges with help from 
S.Z. J.S.M. helped evaluate the purity of the products, and P.J.K. 
coordinated the team with help from L.C. 


SUPPLEMENTARY MATERIALS 


www.sciencemag.org/content/359/6373/314/suppl/DC1 
Materials and Methods 

Figs. S1 to S16 

Tables S1 to S4 

Movie S1 

OpenSCAD libraries.zip 

STL files.zip 

Python Code 


10 July 2017; accepted 1 November 2017 
10.1126/science.aao3466 




RESEARCH 


BIOGEOGRAPHY 


A global atlas of the dominant 
bacteria found in soil 
The immense diversity of soil bacterial communities has stymied efforts to characterize
individual taxa and document their global distributions. We analyzed soils from
237 locations across six continents and found that only 2% of bacterial phylotypes
(~500 phylotypes) consistently accounted for almost half of the soil bacterial 
communities worldwide. Despite the overwhelming diversity of bacterial communities, 
relatively few bacterial taxa are abundant in soils globally. We clustered these dominant 
taxa into ecological groups to build the first global atlas of soil bacterial taxa. Our study 
narrows down the immense number of bacterial taxa to a “most wanted” list that will 
be fruitful targets for genomic and cultivation-based efforts aimed at improving our 
understanding of soil microbes and their contributions to ecosystem functioning. 


Although soil bacteria have been studied for more than a century, most of the
diversity of soil bacteria remains undescribed. This is unsurprising given
that soil bacteria rank among the most abundant and diverse groups of
organisms on Earth (1-4), challenging our capacity to understand their
specific contributions to ecosystem processes, including nutrient and carbon
cycling, plant production, and greenhouse gas emissions (1-3). Put simply,
characterizing the ecological attributes (environmental preferences and
functional traits) of the thousands of bacterial taxa found in soil is
unfeasible. Most soil bacteria do not match those found in preexisting 16S
ribosomal RNA (rRNA) gene databases (5), we have genomic information for
relatively few of them (5-7), and the majority of soil bacteria have not been
successfully cultivated in vitro (6, 7). For these reasons, we lack a
predictive understanding of the ecological attributes of most individual soil
bacterial taxa, with their environmental preferences, traits, and metabolic
capabilities remaining largely unknown.

Previous work has shown that only a small 
fraction of soil bacteria is typically shared between 
any pair of unique soil samples (4, 8, 9). However, 
we also know that, as with most “macrobial” communities
(10), not all bacterial taxa are equally
abundant in soil. There are often subsets of soil
bacterial taxa that are far more abundant than
others. For example, the genus Bradyrhizobium
has been found to be dominant in forest soils from
North America (11). Similarly, a lineage within
the class Spartobacteria was found to be highly
abundant in undisturbed grassland soils (12).
Perhaps more important, many individual taxa 
that are highly abundant in individual soil sam- 
ples may also be abundant across distinct soil 
samples, even when those soil samples are from 
sites located far apart (e.g., Candidatus Udaeobacter 
copiosus) (13). Therefore, a critical and logical next 
step to advance our understanding of soil bac- 
terial communities is to identify the dominant 
bacterial phylotypes that are abundant and ubiq- 
uitous across soils, and determine their ecologi- 
cal attributes. 

From the large body of literature using marker 
gene sequencing to characterize soil bacterial 
communities, we know which major phyla tend 
to be more abundant in soil (14) and we have a 
growing understanding of how various factors, 
including soil properties (e.g., pH) (15), climate 
(9, 16), vegetation type (17), and nutrient availability
(18), structure the composition of soil bacterial
communities worldwide. What is currently
missing is a detailed ecological understanding 
of common soil bacterial species, which we refer 
to as phylotypes (as bacterial species definitions 
can be problematic) (19). Understanding the eco- 
logical attributes of dominant phylotypes will 
increase our ability to successfully cultivate them 
in vitro and allow us to build a more predictive 
understanding of how soil bacterial commu- 
nities vary across space, time, and in response 
to anthropogenic changes. For example, if we
could identify those dominant phylotypes with 
strong preferences for a given set of environ- 
mental conditions (e.g., low or high pH), we could 
then use this information to predict their distri- 
butions and enrich for these dominant phylotypes 
in vitro. Ultimately, a better understanding of 
dominant soil bacterial taxa will improve our 
ability to actively manage soil bacterial commu- 
nities to promote their functional capabilities. 

Here we conducted a global analysis of the 
bacterial communities found in surface soils from 
237 locations across six continents and 18 coun- 
tries (fig. S1) to (i) identify the most dominant 
(i.e., most abundant and ubiquitous) soil bacte- 
rial phylotypes worldwide; (ii) determine which 
of these dominant phylotypes tend to co-occur 
and share similar environmental preferences; 
(iii) map the abundances of these ecological clus- 
ters of dominant soil bacteria across the globe; 
and (iv) assess the genomic attributes that dif- 
ferentiate phylotypes with distinct environmen- 
tal preferences. The soils included in this study 
were selected to span a wide range of vegetation 
types, edaphic characteristics, and bioclimatic 
regions (arid, temperate, tropical, continental 
and polar) (20). 

We first identified the most dominant bacte- 
rial phylotypes by 16S rRNA gene amplicon se- 
quencing (20). Dominant phylotypes (taxa that 
share >97% sequence similarity across the ampli- 
fied 16S rRNA gene region) include those that 
are highly abundant (top 10% most common 
phylotypes sorted by their percentage of 16S 
rRNA reads) (21) and ubiquitous (found in more
than half of the 237 soil samples evaluated) (20). 
Not surprisingly, our global data set comprised 
bacterial communities that were highly variable 
with respect to their diversity and overall compo- 
sition (fig. S2). For example, observed phylotype 
richness ranged from 774 to 2869 phylotypes 
per sample, and there was a large amount of 
variability in the relative abundances of major 
phyla across the studied sites (fig. S2). Also, as 
expected, only a small fraction of phylotypes 
was found to be shared across soil samples, and 
most phylotypes were relatively rare (fig. S3). 
Based on our criteria, only 2% of the bacterial 
phylotypes (511 out of 25,224 phylotypes) were 
dominant (Fig. 1A and table S1). However, this 
small number of phylotypes accounted for, on 
average, 41% of 16S rRNA gene sequences across 
all samples (Fig. 1A), although they collectively 
accounted for more than half of the bacterial 
communities in some environments (e.g., forests 
from arid environments; Fig. 1B). In other words, 
most soil bacterial phylotypes are rare and rel- 
atively few are abundant, but many of these are 
found across a wide range of soils. 
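
The two dominance criteria described above (top 10% of phylotypes by share of 16S rRNA reads, and presence in more than half of the samples) can be sketched as a filter over a phylotype-by-sample count table. This is a minimal illustration rather than the pipeline used in the study: ranking by mean relative abundance across samples is an assumption, and the function name, parameters, and toy counts are hypothetical.

```python
from typing import Dict, List


def dominant_phylotypes(counts: Dict[str, List[int]],
                        top_fraction: float = 0.10,
                        min_prevalence: float = 0.5) -> List[str]:
    """Return phylotypes that are both abundant and ubiquitous.

    counts maps a phylotype ID to its read counts per sample; every list
    must follow the same sample order.
    """
    names = list(counts)
    n_samples = len(counts[names[0]])
    totals = [sum(counts[n][i] for n in names) for i in range(n_samples)]

    mean_rel = {}
    prevalence = {}
    for n in names:
        # Relative abundance per sample, averaged over all samples.
        rel = [counts[n][i] / totals[i] if totals[i] else 0.0
               for i in range(n_samples)]
        mean_rel[n] = sum(rel) / n_samples
        # Fraction of samples in which the phylotype was detected.
        prevalence[n] = sum(1 for c in counts[n] if c > 0) / n_samples

    ranked = sorted(names, key=lambda n: mean_rel[n], reverse=True)
    n_top = max(1, round(top_fraction * len(ranked)))
    return [n for n in ranked[:n_top] if prevalence[n] > min_prevalence]


if __name__ == "__main__":
    # Toy table: 5 phylotypes in 4 samples (hypothetical counts).
    toy = {
        "A": [50, 40, 60, 55],   # abundant everywhere
        "B": [300, 0, 0, 0],     # abundant in a single sample only
        "C": [3, 30, 20, 25],
        "D": [1, 20, 10, 10],
        "E": [1, 10, 10, 10],
    }
    print(dominant_phylotypes(toy, top_fraction=0.4))
```

In the toy table, phylotype B ranks among the most abundant but occurs in only one of four samples, so it fails the ubiquity criterion; this mirrors the distinction the text draws between locally abundant and globally dominant taxa.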

Notably, 85% of the dominant phylotypes iden- 
tified from our data set were also found to be 
dominant in the bacterial communities recovered 
from 123 global soils that were analyzed using a 
shotgun metagenomic approach (20) (table S1). 
This cross-validation indicates that our list of 
dominant phylotypes is not biased by polymer- 
ase chain reaction amplification or by our choice 
of primers, as most of the identified dominant
phylotypes were shared between two indepen- 
dent sets of soils analyzed using two different 
approaches (amplicon versus shotgun metage- 
nomic sequencing). In addition, we compared 
the results from our sample set with those soils 
analyzed via amplicon sequencing as part of 
the Earth Microbiome Project (EMP) (22). The 
majority of the dominant phylotypes in the 
EMP data set (80%)—identified using the same 
criteria explained above—were included within 
our list of dominant taxa (>97% similarity) (20). 
Also, the top 511 phylotypes, comparable to our 
top 511 dominant taxa, accounted for 0.5% of all 
bacterial phylotypes and 41% of all 16S rRNA 
gene reads in the EMP data set. Despite impor- 
tant methodological differences between the two 
data sets (20), this concordance between the 
results from EMP and our study reinforces our 
conclusion that a relatively small subset of bac- 
terial phylotypes dominate soils across the globe. 

On average, the dominant bacterial phylotypes identified from our data set
were highly abundant in soils across multiple continents, ecosystem types,
and bioclimatic regions (Fig. 1B). The only exception was soil from tropical
forests, where the dominant phylotypes accounted for only ~20% of 16S rRNA
gene sequences, which is likely a product of soils from tropical forests
being under-represented in our database and/or tropical forest bacterial
communities being very distinct from those found in other ecosystem types
(fig. S4). Together, our results suggest that soil bacterial communities,
like plant communities (10), are typically dominated by a relatively small
subset of phylotypes. As such, we focus all downstream analyses on the 511
phylotypes found to be the most abundant and ubiquitous in soils from across
the globe.

Fig. 1. Abundance and composition of dominant soil bacterial phylotypes
across the globe. (A) Percentage of phylotypes and relative abundance of 16S
rRNA genes representing the dominant versus the remaining bacterial
phylotypes. (B) Relative abundance (mean ± SE) of dominant phylotypes across
continents and ecosystem types. Ecosystem type classification followed the
Köppen climate classification and the major vegetation types found in our
database. Grasslands include both tropical and temperate grasslands.
Shrublands include polar, temperate, and tropical shrublands. The number of
samples in each category is indicated in parentheses. (C) The taxonomic
composition of the dominant phylotypes. The phylotypes assigned to the least
abundant phyla are not shown (including Armatimonadetes = 0.08%, TM7 = 0.05%,
and WS2 = 0.03%). Details on the top 511 dominant phylotypes are shown
[Fig. 1B axis: relative abundance (% all reads); recoverable category labels:
Africa, Dry grasslands, Shrublands.]

The identified dominant phylotypes accurately predicted overall patterns in
β-diversity for the “subdominant” component of the bacterial communities
surveyed (98% of phylotypes; figs. S2 and S5 and Fig. 1C). That is, patterns
in the distribution of the dominant bacterial phylotypes across the globe
closely mirrored those observed for the remaining 98% of bacterial
phylotypes. The most abundant and ubiquitous of these 511 phylotypes included
Alphaproteobacteria (Bradyrhizobium sp., Sphingomonas sp., Rhodoplanes sp.,
Devosia sp., and Kaistobacter sp.), Betaproteobacteria (Methylibium sp. and
Ramlibacter sp.), Actinobacteria (Streptomyces sp., Salinibacterium sp., and
Mycobacterium sp.), Acidobacteria (Candidatus Solibacter sp. and order
iii1-15), and Planctomycetes (order WD2101) (see table S1 for a complete
list). Notably, less than 18% of the 511 phylotypes that we identified had a
match to an available reference genome at the >97% 16S rRNA sequence
similarity level, the level commonly used for delineating different bacterial
species (23) (Fig. 2 and table S1). Approximately 42% of the dominant 511
phylotypes had no genome match even at the >90% 16S rRNA sequence similarity
level, indicating that we do not have genomic information for taxa even
within the same genus or family (Fig. 2A and table S1). Further, only 45% of
the identified 511 dominant phylotypes are related to cultivated isolates and
<30% of the phylotypes have representative type strains at the >97% sequence
similarity level (Fig. 2B and table S1), which emphasizes the limited amount
of phenotypic information that we have available for these dominant
phylotypes. Not surprisingly, phylotypes closely related to previously
cultivated taxa tended to come from a few well-studied taxonomic groups,
mostly Proteobacteria and Actinobacteria, with only a few representatives
available from other phyla (Figs. 1C and 2B and table S1), highlighting the
well-known taxonomic biases of many preexisting culture collections (6).
Fig. 2. Phylogenetic tree including the taxonomic information on dominant soil bacterial
phylotypes. (A) Histogram showing the percentage 16S rRNA gene sequence similarity between
the 511 dominant phylotypes and the most closely related available reference genome for each
phylotype. (B) Phylogenetic distribution of the 511 dominant phylotypes. Black shading on the
innermost and middle rings indicates, for each phylotype, whether there is a representative isolate
and a genome match at the ≥97% 16S rRNA gene sequence similarity level. The coloring on the
outermost ring highlights the distribution of environmental preferences for all phylotypes (n = 511).
For the few phylotypes where taxonomic assignment did not correspond to tree topology, no manual
corrections were made. Betaproteo., Betaproteobacteria; Alphaproteo., Alphaproteobacteria;
Deltaproteo., Deltaproteobacteria; Plancto., Planctomycetes; Firmic., Firmicutes.
After identifying the dominant 511 phylotypes,
we used random forest modeling (24) to iden- 
tify habitat preferences for each phylotype (20). 
Our statistical models included 15 environmen- 
tal factors: climate (aridity index, minimum and 
maximum temperature, precipitation seasonal- 
ity, and mean diurnal temperature range), ultra- 
violet (UV) radiation, net primary productivity, 
soil abiotic properties (soil texture; pH; total C, 
N, and P concentrations; and C:N ratio), and dom- 
inant ecosystem type (forests and grasslands) 
(20). We found that 53% (270) of the dominant 
511 phylotypes had predictable habitat prefer- 
ences [models explaining >30% of the variation; 
see (20) and table S1], with soil pH, climatic fac- 
tors (aridity index, maximum temperature, and 
precipitation seasonality), and plant productivity 
consistently being the best predictors of their 
abundances across the globe (fig. S6). These find- 
ings are in line with previous research demon- 
strating that climatic factors and soil pH are 
often highly correlated with observed differences 
in overall soil bacterial community composition 
(4, 8, 9, 15, 16), but additionally, we found a strong 
link between microbial community composition 
and plant productivity (fig. S7). We were unable 
to identify a strong ecological preference for the 
remaining 241 of the 511 phylotypes, which in- 
cluded representatives from a wide range of phyla 
and subphyla (fig. S8). Our inability to predict the 
distributions of these 241 phylotypes could be re- 
lated to the absence of key, but hard to measure, 
environmental predictors (e.g., soil C availability) 
or the fact that our models did not take into ac- 
count specific associations between the bacteria 
and plants, fungi, or animals (e.g., pathogen- 
host or predator-prey interactions), which may 
be driving their distribution patterns. Alterna- 
tively, we may not have been able to identify the 
habitat preferences of these phylotypes because 
of low variability in their abundances across the 
samples (figs. S9 and S10). Indeed, the relative 
abundance of the group including all 241 un- 
determined phylotypes showed a much lower 
coefficient of variation than the relative abun- 
dance of those phylotypes for which we could 
identify their habitat preferences, as explained 
below (fig. S9). This result suggests that the un- 
determined phylotypes, those with no clearly 
identifiable habitat preferences, represent a “core” 
group of dominant phylotypes that are ubiquitous 
across global soils with proportional abundances 
that are relatively invariant. 
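
The comparison above rests on the coefficient of variation (standard deviation divided by the mean) of relative abundance across samples. A minimal sketch of that statistic follows; the use of the sample standard deviation is an assumption, and the abundance values are hypothetical toy numbers, not data from the study.

```python
import statistics
from typing import Sequence


def coefficient_of_variation(values: Sequence[float]) -> float:
    """Sample standard deviation divided by the mean."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for zero mean")
    return statistics.stdev(values) / mean


if __name__ == "__main__":
    # Hypothetical relative abundances (%) across four soil samples.
    core_group = [10.0, 11.0, 9.0, 10.0]       # nearly invariant
    habitat_specific = [1.0, 20.0, 2.0, 17.0]  # strongly site dependent
    print(f"core group CV: {coefficient_of_variation(core_group):.2f}")
    print(f"habitat-specific CV: {coefficient_of_variation(habitat_specific):.2f}")
```

A group with a low coefficient of variation, like the toy core group here, gives a predictive model little variation to explain, which is consistent with the explanation offered for the phylotypes without identifiable habitat preferences.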

We then used semipartial correlations (Spearman) 
and clustering analyses (20) to identify groups of 
phylotypes with shared habitat preferences, re- 
stricting our analyses to those 270 phylotypes 
with predictable distribution patterns. We found 
that the phylotypes group into five reasonably 
well-defined ecological clusters sharing environ- 
mental preferences for (i) high pH; (ii) low pH; 
(iii) drylands; (iv) low plant productivity; and 
(v) dry-forest environments (Figs. 2B and 3A, fig. 
S11, and table S1). These five clusters of phylo- 
types included 200 out of the 270 phylotypes for 
which we could identify their habitat prefer- 
ences (table S1). Each of the ecological clusters 
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Fig. 3. Identified habitat preferences for dominant soil bacterial 
phylotypes. (A) Relationships between the relative abundance 

of the phylotypes assigned to each ecological cluster and their 

major environmental predictors (statistical analyses and identity of 
phylotypes within each cluster are presented in table S1). (B) Network 


identified included phylotypes from multiple 
phyla, suggesting that habitat preferences are not 
linked to phylogeny at coarse levels of resolution 
(fig. S8). The remaining 70 phylotypes were clas- 
sified into three minor clusters, including a small 
cluster consisting of six phylotypes (high pH-forest 
preference; table S1 and fig. S11) and two clusters 
that included phylotypes with preferences includ- 
ing warm-forests, sites with low seasonal varia- 
tion in precipitation, mesic environments, and 
soils of low phosphorus content (table S1 and 
fig. S11). These results suggest that the dominant 
bacterial phylotypes can be clustered into predic- 
table ecological groups that share similar habitat 
preferences. To cross-validate the ecological clus- 
ters, we used correlation network analyses (20, 25) 
to investigate whether bacterial phylotypes shar- 
ing similar habitat and environmental prefer- 
ences tend to co-occur (Fig. 3B). Indeed, our network 
analyses indicated that bacterial phylotypes shar- 
ing a particular habitat preference (e.g., low pH) 
tend to co-occur with other phylotypes belonging 
to the same cluster more than we would expect 
by chance (P < 0.001 for all clusters; Fig. 3B and 
fig. S12). 
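
The co-occurrence check described above can be illustrated with a small label-permutation test. This is a minimal sketch, not the correlation network analysis of (20): the abundance values, phylotype names, and function names are all made up for illustration, and the test statistic (mean pairwise Pearson correlation within clusters) is an assumption chosen for simplicity.

```python
import itertools
import random


def pearson(x, y):
    """Pearson correlation between two equal-length abundance vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def mean_within_cluster_corr(abund, labels):
    """Average correlation over all phylotype pairs that share a cluster label."""
    vals = [
        pearson(abund[a], abund[b])
        for a, b in itertools.combinations(sorted(abund), 2)
        if labels[a] == labels[b]
    ]
    return sum(vals) / len(vals)


def permutation_p(abund, labels, n_perm=999, seed=0):
    """Share of label shufflings whose statistic reaches the observed one."""
    rng = random.Random(seed)
    observed = mean_within_cluster_corr(abund, labels)
    names = sorted(labels)
    values = [labels[k] for k in names]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(values)
        if mean_within_cluster_corr(abund, dict(zip(names, values))) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical relative abundances across six soil samples (illustrative only).
abund = {
    "low_pH_a": [5, 4, 3, 2, 1, 0.5],
    "low_pH_b": [6, 5, 3, 2, 1, 1],
    "low_pH_c": [4, 4, 2, 2, 1, 0],
    "high_pH_a": [1, 1, 2, 3, 5, 6],
    "high_pH_b": [0, 1, 2, 4, 4, 5],
    "high_pH_c": [1, 2, 2, 3, 6, 6],
}
labels = {name: name.rsplit("_", 1)[0] for name in abund}
observed, p_value = permutation_p(abund, labels)
```

With only six phylotypes the attainable P value is coarse; the point is the logic of comparing within-cluster co-occurrence against a shuffled-label baseline, not the numbers.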

in fig. S12).

We next sought to determine if we could iden-
tify genomic attributes that delineate bacteria as-
signed to the individual ecological clusters. These
analyses were restricted to the relatively small 
subset of bacterial phylotypes for which genomic 
data were available (>97% 16S rRNA sequence 
similarity to a reference genome). An insufficient 
number of representative unique genomes were 
available from phylotypes in four of the five major 
clusters identified (fig. S13). However, we had ge- 
nomic data for 10 unique genomes out of 25 
phylotypes assigned to the “drylands” cluster, 
including representatives of the Proteobacteria 
and Actinobacteria phyla (fig. S13). We then iden- 
tified functional genes that were overrepresented 
in this “drylands” cluster as compared to the ge- 
nomes available for the other dominant taxa. A 
total of 72 genomes were included in this anal- 
ysis, with 10 of these genomes belonging to the 
dryland cluster (20). We found that the genomes 
within this dryland cluster had significantly higher 
relative abundances of 18 genes (fig. S14) com- 
pared to genomes representative of phylotypes 
assigned to other ecological clusters. Notably, 
Mnh and Mrp genes, which encode membrane 
transport proteins responsible for the proton-
mediated efflux of monovalent cations (e.g., Na+,
K+), were overrepresented in the “drylands” clus-
ter (fig. S14).

diagram with nodes (bacterial phylotypes) colored by each of
the five major ecological clusters that were identified, highlighting
that the phylotypes within each ecological cluster tend to co-occur
more than expected by chance (statistical analyses presented

These genes have frequently been
linked to increased bacterial tolerance to alkaline
or saline conditions and, more generally, a greater 
capacity to tolerate external changes in the os- 
motic environment (26). These adaptations are 
likely to be important for bacteria living in arid 
soils, which are often saline, have high pH values, 
and experience prolonged periods of low mois- 
ture availability (27). Given the low number of 
reference genomes available, these findings are 
not conclusive and are simply a “proof of con- 
cept.” Nevertheless, our results highlight that it 
is possible to identify genomic attributes that 
differentiate soil bacteria with distinct environ- 
mental preferences. They also emphasize the im- 
portance of acquiring new genomes to further 
understand the ecological attributes of dominant 
soil bacterial taxa. As such, our results pave the 
way for leveraging genomic data to predict the 
spatial distributions of soil bacterial taxa, efforts 
that will be improved as the collections of ref- 
erence genomes from these microorganisms in- 
crease in size. 

Together, our results suggest that there are 
predictable clusters of co-occurring dominant 
bacterial phylotypes in soils from across the 
globe.

Fig. 4. A global atlas of the dominant bacteria found in soil.
(A to D) Predicted global distribution of the relative abundances of the
four major ecological clusters of bacterial phylotypes sharing habitat
preferences for high pH, low pH, drylands, and low plant productivity.
R² (percentage of variation explained by the models) as follows:

This finding indicates that commonly
available environmental information could be
used to build predictive maps of the dis-
tributions of these bacterial clusters at a global
scale. We did so for the four major ecological
clusters (i.e., low pH, high pH, drylands, and low 
productivity, Fig. 4) (20) using the prediction- 
oriented regression model Cubist (28) and in- 
formation on 12 environmental variables for 
which we could acquire globally distributed in- 
formation (20). Our models confirm that pH, 
aridity levels, and net primary productivity are 
major drivers of the low-pH, high-pH, dryland, 
and low-productivity clusters observed, respec- 
tively (Appendix S1). Notably, our maps (which 
accounted for 36 to 64% of the spatial variation 
in these clusters, Fig. 4) provide estimates of the 
regions where we would expect the groups of 
dominant soil bacterial phylotypes to be most 
abundant (Fig. 4). As expected, the dryland and 
low-productivity clusters were relatively abun- 
dant in dryland and low-productivity regions 
across the globe, and the low- and high-pH 
clusters were particularly abundant in areas 
known for their low- or high-pH soils, respectively. 

This global inventory of dominant soil bacte- 
rial phylotypes represents a small subset of phylo- 
types that account for almost half of the 16S rRNA 
sequences recovered from soils. We show that we 
can predict the environmental preferences for 
more than half of these dominant phylotypes, 
making it possible to predict how future envi-
ronmental change will affect the spatial distribu-
tion of these taxa. Following Grime’s mass ratio
hypothesis (10), we would expect that identify- 
ing the physiological attributes of these dom- 
inant taxa will be critical for improving our 
understanding of the microbial controls on some 
key soil processes, including those that regulate 
soil C and nutrient cycling (1-3, 29). Also, given
the strong links between the distribution of bac- 
terial phylotypes and their functional attributes 
across the globe (8, 12), and the observed asso- 
ciations between dominant and subdominant 
phylotypes (fig. S5), we expect that these domi- 
nant bacteria will be critical drivers, or indica- 
tors, of key soil processes worldwide. We also 
found that habitat preferences were not predict- 
able from phylum-level identity alone, given that 
all of the ecological clusters included phylotypes 
from multiple phyla. This suggests that phylotypes 
from diverse taxa share some phenotypic traits 
(e.g., osmoregulatory capabilities) or life-history 
strategies (29, 30) that allow them to survive 
under particular environmental conditions. By 
narrowing down the number of phylotypes to 
be targeted in future studies from tens of thou- 
sands to a few hundred, our study paves the 
way for a more predictive understanding of soil 
bacterial communities, which is critical for accu- 
rately forecasting the ecological consequences of 
ongoing global environmental change. 

(i) high-pH cluster, R² = 0.53, P < 0.001; (ii) low-pH cluster, R² = 0.36,
P < 0.001; (iii) drylands cluster, R² = 0.64, P < 0.001; and (iv) low-
productivity cluster, R² = 0.40, P < 0.001. The scale bar represents the
standardized abundance (z-score) of each ecological cluster. An
independent cross-validation for these maps is available in (20).

Improving refugee integration
through data-driven
algorithmic assignment


Kirk Bansak, Jeremy Ferwerda, Jens Hainmueller, Andrea Dillon,
Dominik Hangartner, Duncan Lawrence, Jeremy Weinstein


Developed democracies are settling an increased number of refugees, many of whom face 
challenges integrating into host societies. We developed a flexible data-driven algorithm that 
assigns refugees across resettlement locations to improve integration outcomes. The algorithm 
uses a combination of supervised machine learning and optimal matching to discover and 
leverage synergies between refugee characteristics and resettlement sites. The algorithm was 
tested on historical registry data from two countries with different assignment regimes and 
refugee populations, the United States and Switzerland. Our approach led to gains of roughly 40 
to 70%, on average, in refugees’ employment outcomes relative to current assignment 
practices. This approach can provide governments with a practical and cost-efficient policy tool 
that can be immediately implemented within existing institutional structures. 


Refugees are among the world’s most vul-
nerable populations (1, 2). After experienc-
ing war, violence, and years of living in
overcrowded refugee camps, refugees arrive
in a new country with few resources and
must acclimate to an unfamiliar local language,
economy, and culture. Refugees frequently remain 
economically marginalized, with low levels of em- 
ployment in the years following their arrival (3-5). 
The assignment of refugees to different reset- 
tlement locations within a host country is one of 
the first policy decisions made during the re- 
settlement process (6). It is also one of the most 
consequential in maximizing refugees’ economic 
integration and self-sufficiency as a first step to- 
ward a more comprehensive integration into so- 
ciety (7-9). Three sets of factors affect refugee 
integration: geographical context, personal char- 
acteristics, and synergies between geography and 
personal characteristics (Fig. 1 and fig. S1). For
instance, some resettlement locations in the United 
States offer better economic and social oppor- 
tunities that can result in higher levels of refugee 
employment (Fig. 1A). In addition, refugees with 
certain characteristics, such as language and edu- 
cational skills, are more likely to succeed econom- 
ically regardless of the resettlement location to 
which they are sent (Fig. 1B). Finally, the expected 
employment returns associated with personal 
characteristics can vary across different resettle- 
ment locations (Fig. 1C). This indicates that there 
are synergies between places and people; certain 
characteristics will make a refugee a better match
for a particular location. In Switzerland, for exam-
ple, we find that the ability to speak French (i.e.,
among French-speaking African refugees) results
in a larger payoff for refugees assigned to French-
speaking cantons than for those assigned to
German-speaking cantons (fig. S2).

Host countries’ current procedures for deter- 
mining how to allocate refugees across domestic 
resettlement sites do not fully leverage synergies 
between refugees and geographic locations. For 
instance, in the United States, refugees without 
existing U.S. ties are primarily assigned to re- 
settlement locations according to the capacity of 
local resettlement offices at the time of arrival, 
without a systematic assessment of the local em- 
ployment rate for refugees of similar profiles. In 
Switzerland, where most refugees initially enter 
as asylum seekers, the federal government at- 
tempts to reduce fiscal and social strain on in- 
dividual localities by making assignment random 
and proportional across regions. 

Prior research has proposed different schemes 
for refugee assignment both across countries 
(10, 11) and within countries (12, 13). These 
proposals include two-sided matching markets 
in which an optimized assignment is determined 
on the basis of match efficiency and/or the pre-
ferences of refugees and host locations (14). Al-
though these approaches are theoretically appealing,
there are practical barriers to their implementa- 
tion, including a lack of systematic data on refugee 
preferences and the need for extensive political 
coordination. 

We have developed a data-driven approach that, 
in contrast, can be immediately implemented by 
using existing data to optimize integration out- 
comes. Our algorithm has three stages: modeling, 
mapping, and matching. The modeling stage in- 
volves a supervised machine learning process that 
predicts the expected success for any quantifiable 
metric—for example, early employment—of new 
refugee arrivals across all possible resettlement
locations. We designated historical resettlement 
data for model training, in which the unit of ob- 
servation was a single refugee and which con- 
tained information on the refugees’ background 
characteristics (e.g., country of origin, language 
skills, gender, age, etc.), time of arrival, assigned 
location, and measured employment success. These 
training data were then used to build a bundle of 
supervised learning models that predicted refu- 
gees’ expected employment success as a function 
of their background characteristics. A separate 
model was fit for subgroups of refugees assigned 
to each location, thus yielding different models 
for each location and allowing for the discovery 
of refugee/location synergies. These fitted models 
were then applied to new, out-of-sample refugee 
arrival data to predict the expected employment 
success of each new arrival at each possible re- 
settlement location. 
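
The idea of fitting one model per location can be sketched as follows. This is a simplified illustration under stated assumptions, not the supervised learning models used in the study: each "model" here is just a smoothed employment rate per background profile, and the feature name `speaks_french`, the location names, and all counts are hypothetical.

```python
def fit_location_models(rows, feature_names):
    """Build one lookup table per location: profile -> [employed, total]."""
    models = {}
    for row in rows:
        profile = tuple(row[f] for f in feature_names)
        cell = models.setdefault(row["location"], {}).setdefault(profile, [0, 0])
        cell[0] += row["employed"]
        cell[1] += 1
    return models


def predict_all_locations(models, feature_names, arrival, prior=0.5, strength=2.0):
    """Predict a new arrival's employment probability at every location.

    Unseen profiles fall back to `prior`; `strength` controls the smoothing.
    Both are illustrative choices, not values from the paper.
    """
    profile = tuple(arrival[f] for f in feature_names)
    predictions = {}
    for location, table in models.items():
        employed, total = table.get(profile, (0, 0))
        predictions[location] = (employed + prior * strength) / (total + strength)
    return predictions


def make_rows(location, speaks_french, n_employed, n_total):
    """Generate made-up training rows with a given employment count."""
    return [
        {"location": location, "speaks_french": speaks_french,
         "employed": 1 if i < n_employed else 0}
        for i in range(n_total)
    ]


# Hypothetical historical data: French speakers fare better in the French canton.
training = (
    make_rows("french_canton", True, 3, 4)
    + make_rows("french_canton", False, 1, 4)
    + make_rows("german_canton", True, 1, 4)
    + make_rows("german_canton", False, 1, 4)
)
features = ["speaks_french"]
models = fit_location_models(training, features)
preds = predict_all_locations(models, features, {"speaks_french": True})
```

Because each location has its own table, the same profile can receive different predictions at different sites, which is how refugee/location synergies show up in this sketch.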

The mapping stage involves transforming the 
refugee-level predictions from the modeling stage 
to a case-level metric. Mapping to a case-level me- 
tric is necessary because refugees are often not 
assigned to locations on an individual basis, but 
rather on a case-level basis, with cases most often 
being family units. Various mapping functions 
can be used. Our preferred case-level metric was 
the predicted probability that at least one refu- 
gee in the case would find employment at the 
location in question. This metric uses a simpli- 
fying assumption that the probabilities of em- 
ployment for refugees within a case are inde- 
pendent, although we also tested alternative 
mapping functions—namely the mean, maximum, 
and minimum predicted probability of employ- 
ment within each case—that do not require this 
assumption (15). 
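
Under the independence assumption, the preferred case-level metric is one minus the probability that no member of the case finds employment. A minimal sketch of this and the three alternative mapping functions follows; the function and dictionary names are illustrative, and the probabilities are made up.

```python
import math


def p_at_least_one(probs):
    """P(at least one member employed), assuming independent outcomes."""
    return 1.0 - math.prod(1.0 - p for p in probs)


def mean_prob(probs):
    """Mean predicted probability within the case."""
    return sum(probs) / len(probs)


# Alternative mappings that do not rely on the independence assumption.
MAPPINGS = {
    "at_least_one": p_at_least_one,
    "mean": mean_prob,
    "max": max,
    "min": min,
}

# Hypothetical two-person case with individual predictions of 0.5 and 0.2.
case_probs = [0.5, 0.2]
case_scores = {name: fn(case_probs) for name, fn in MAPPINGS.items()}
```

For this example the preferred metric is 1 - (0.5 × 0.8) = 0.6, which is higher than either individual probability because either member finding work counts as success.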

Finally, the matching stage involves assigning 
each case to a specific location to fulfill a chosen 
optimality criterion subject to constraints. Our 
algorithm is flexible and can accommodate mul- 
tiple criteria and constraints. The optimality 
criterion we used in our applications was to 
maximize the average of the case-level metric 
(i.e., the global average of the probability that 
at least one refugee in each family gains em- 
ployment). We also imposed constraints that 
represent real-world assignment restrictions, 
such as how many cases can be sent to different 
locations. To solve this constrained optimiza- 
tion problem, we used an optimal matching 
procedure with the RELAX-IV minimum cost 
flow solver (16, 17); see supplementary mate- 
rials and figs. S3 to S5 for details of the al- 
gorithm, data, measures, and statistical analysis 
(including out-of-sample classification accuracy 
and probability calibration). 
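
The matching stage can be illustrated on a toy problem. The sketch below is a brute-force search that is only practical for a handful of cases; it is not the RELAX-IV minimum cost flow solver used in the study, and the case names, location names, scores, and capacities are all hypothetical.

```python
import itertools


def best_assignment(scores, capacity):
    """Exhaustively find the assignment maximizing the mean case-level score.

    scores:   {case: {location: case-level metric}}
    capacity: {location: maximum number of cases}
    Returns (None, -1.0) if total capacity is smaller than the number of cases.
    """
    cases = sorted(scores)
    # Expand each location into as many slots as it has capacity.
    slots = [loc for loc, cap in sorted(capacity.items()) for _ in range(cap)]
    best, best_avg = None, -1.0
    for chosen in set(itertools.permutations(slots, len(cases))):
        avg = sum(scores[c][loc] for c, loc in zip(cases, chosen)) / len(cases)
        if avg > best_avg:
            best, best_avg = dict(zip(cases, chosen)), avg
    return best, best_avg


# Hypothetical case-level metrics at two sites; site_b can take only one case.
scores = {
    "case1": {"site_a": 0.6, "site_b": 0.7},
    "case2": {"site_a": 0.3, "site_b": 0.6},
    "case3": {"site_a": 0.5, "site_b": 0.4},
}
capacity = {"site_a": 2, "site_b": 1}
assignment, average = best_assignment(scores, capacity)
```

Note that case1 has the highest score at site_b, yet the optimal solution gives the single site_b slot to case2, whose gain from that site is larger. This is the kind of trade-off a global optimality criterion captures and a greedy rule would miss.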

For the algorithm to obtain reliable predictions, 
it is important that the historical assignment 
process not be determined by unobserved refugee 
characteristics. This criterion is currently met in 
many countries that assign refugees either ran- 
domly (according to burden-sharing constraints) 
or according to premeasured refugee character- 
istics that would serve as feature inputs into the 
algorithm. We assessed the performance of the 
algorithm through applications in two such coun-
tries: the United States, where refugees are assigned
primarily on the basis of capacity constraints, 
and Switzerland, where refugees are assigned 
randomly according to a proportional distribution 
key (see supplementary materials and tables S1 
and S2 for details). 

In the United States, reception and placement 
services (e.g., arranging location assignments,
housing, etc.) for refugees are implemented by
nine voluntary agencies in cooperation with the 
Department of State. After refugees are allocated 
to one of the agencies, placement officers cen- 
trally assign refugees to the agency’s resettlement 
locations subject to local capacity constraints (18). 
Placement officers make assignment decisions 
prior to refugees’ arrival and without interviewing 
the refugees. The premeasured characteristics of a
case available to the placement officers can be
viewed in the data, and hence can be used as 
feature inputs into the algorithm. 

Refugees are granted work authorization upon 
arrival and encouraged to find employment as 
soon as possible. To track refugee resettlement 
success, the agencies are required to report the 
refugees’ employment status at the end of the 
reception and placement period, 90 days after
arrival.

Fig. 1. Variation in refugee employment in the United States. (A to C)
Refugee employment at 90 days after arrival varies as a function of
refugees’ assigned resettlement location (A), personal characteristics
(pooled across refugees assigned to all locations) (B), and synergies
between characteristics and locations (two example locations) (C). In (B)
and (C), dots with horizontal lines indicate point estimates with robust
95% confidence intervals from ordinary least-squares regression. The
open circles on the zero line denote reference categories. The data
for all three panels include working-age refugees resettled by one of
the largest U.S. resettlement agencies during the 2011-2016 period
(n = 33,782). These results are replicated for only working-age refugees
without U.S. ties (i.e., “free cases”) in fig. S1.

Bansak et al., Science 359, 325-329 (2018) 19 January 2018

To assess whether an optimized assign-
ment could improve refugee outcomes, we anal-
yzed de-identified data from one of the largest
resettlement agencies for working-age refugees 
(ages 18 to 64; n = 33,782) resettled during the
2011-2016 period. We split the data into training 
and test sets. For model training, we used data 
for the refugees who arrived from 2011 up to (but 
not including) the third quarter (Q3) of 2016, the 
most recent quarter with available data. We then 
applied the fitted models to predict the expected 
employment success at each location and determine 
the optimal assignment for the test set, refugees 
who arrived in 2016 Q3. For the test data, we focused 
on refugees who were free to be assigned to different 
resettlement locations (n = 919), in contrast to 
refugees who are assigned according to the loca- 
tion of family or other ties. We also imposed con- 
straints on the assignment such that each location 
could only receive as many cases under the opti- 
mized assignment as were received in actuality. 
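
The split and the capacity constraint described above can be sketched as follows. The record layout (`case_id`, `arrival` as a (year, quarter) tuple, `location`, `free_case`) and the function names are assumptions made for illustration, and the records are made up.

```python
def temporal_split(records, test_quarter):
    """Train on all arrivals before the test quarter; test on free cases in it."""
    train = [r for r in records if r["arrival"] < test_quarter]
    test = [r for r in records
            if r["arrival"] == test_quarter and r["free_case"]]
    return train, test


def observed_capacity(test):
    """Number of distinct cases each location actually received in the test set."""
    cases_by_location = {}
    for r in test:
        cases_by_location.setdefault(r["location"], set()).add(r["case_id"])
    return {loc: len(ids) for loc, ids in cases_by_location.items()}


# Hypothetical refugee-level records; case 2 is a two-person family.
records = [
    {"case_id": 1, "arrival": (2016, 2), "location": "site_a", "free_case": True},
    {"case_id": 2, "arrival": (2016, 3), "location": "site_a", "free_case": True},
    {"case_id": 2, "arrival": (2016, 3), "location": "site_a", "free_case": True},
    {"case_id": 3, "arrival": (2016, 3), "location": "site_b", "free_case": True},
    {"case_id": 4, "arrival": (2016, 3), "location": "site_b", "free_case": False},
]
train, test = temporal_split(records, (2016, 3))
capacity = observed_capacity(test)
```

Counting distinct cases rather than individuals reflects that assignment happens at the case level, so the optimized assignment sends each location exactly as many cases as it received in actuality.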

Our algorithmic assignment considerably in- 
creased expected refugee employment over the 
status quo assignment (Fig. 2). The median re- 
fugee’s predicted probability of employment in 
the United States more than doubled, increasing 
from approximately 25% to 50%. Our optimized 
assignment increased the probability of finding 
employment across the entire distribution of re- 
fugees, including those who were least likely and 
most likely to find work. 

In addition, the algorithmic assignment yielded 
higher employment rates in almost every location, 
including locations that had higher and lower 
baseline employment rates. On average, the employ- 
ment rate was 34% under the actual assignment 
and 48% under the optimized assignment, which 
means that the optimized assignment would in- 
crease the employment rate above the baseline 
by roughly 41%. 

We conducted a second test in the context of 
Switzerland, whose asylum process is similar to 
that of other European countries belonging to the 
Common European Asylum System. In Switzerland,
asylum seekers who are not immediately rejected 
upon arrival are assigned to one of 26 cantons, 
where they wait for a decision on their asylum 
application. We focused on asylum seekers who 
received subsidiary protection status, which is 
Switzerland’s largest refugee category (see sup-
plementary materials). We drew upon data from
the Swiss State Secretariat for Migration (SEM), 
which centrally manages the asylum process and 
assignment. In contrast to the U.S. case, the SEM 
uses a proportional random assignment of cases 
to locations, and tracks employment outcomes 
for several years after asylum seekers’ arrival. This 
allowed us to benchmark our algorithm against a 
different status quo assignment mechanism and to 
optimize for a longer-term employment metric— 
specifically, refugees’ employment at the end of 
their third calendar year in Switzerland. We focused 
on all working-age refugees who received subsidiary 
protection status and arrived from 1999 to 2013 
(n = 22,159), with refugees arriving in 2013 who 
were free to be assigned to any canton as the test 
set (n = 888) and refugees arriving in all prior 
years as the training set. We also imposed the 
constraint that each canton gets assigned the same 
number of cases as in actuality, in which the num- 
ber of cases, by law, is assigned in proportion to 
the population of the canton. 

Our algorithmic assignment considerably in- 
creased expected refugee employment over the 
status quo assignment (Fig. 3). Similar to the U.S. 
context, our algorithm increased the predicted 
probability of finding employment across the en- 
tire distribution of refugees. On average, the third- 
year employment rate was 15% under the actual 
assignment and 26% under the optimized assign- 
ment. These results suggest that the data-driven 
assignment has the potential to increase third-year 
employment in the Swiss context by about 73%. 
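
The reported gains are relative increases over the baseline employment rate. The short check below recomputes them from the rounded rates given in the text, so the results are approximate; the function name is illustrative.

```python
def relative_gain(baseline, optimized):
    """Relative increase of the optimized rate over the baseline rate."""
    return (optimized - baseline) / baseline


# Rates as reported in the text (rounded to whole percentages).
us_gain = relative_gain(0.34, 0.48)     # 90-day employment, United States
swiss_gain = relative_gain(0.15, 0.26)  # third-year employment, Switzerland
```

The absolute gains are 14 and 11 percentage points, respectively; the relative gain is larger in Switzerland mainly because the baseline rate is lower.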

In the supplement, we present further results 
for both countries in which we applied alter-
native specifications for the algorithm. Specifically,
we replicated testing with different time periods
(figs. S6 and S7), alternative mapping functions 
(fig. S8), shorter- and longer-term outcomes (fig. S9), 
and varying lengths of the training data period 
(fig. S10). The results from these tests all show 
considerable gains. 

Our analysis demonstrated large potential im- 
provements, but we did not test the algorithm 
prospectively. Ideally, it should be tested in a 
randomized controlled trial design. In addition, 
further research is needed to determine whether 
it is more effective to optimize for short-term or 
long-term outcomes. In Switzerland, for example, 
we find considerable gains regardless of whether 
we optimize for second-, third-, or fourth-year em- 
ployment (see supplementary materials). In the 
United States, however, longer-term employment 
outcomes are currently not tracked. Still, early 
employment is often highly predictive of long- 
term employment (19), and the use of shorter-term 
outcomes in the algorithm allows for faster learn- 
ing of emerging and declining synergies based 
on more recent data, possibly resulting in a more 
effective assignment. 

In contrast to more expensive interventions 
(such as language or job training programs) that 
are sometimes implemented long after refugees’ 
arrival, our approach is cost-efficient and imple- 
mented before refugees’ arrival, giving them the 
strongest foundation possible from which to in- 
tegrate into host societies. Furthermore, our ap- 
proach modifies an existing policy process, 
facilitating its immediate implementation, and 
it is dynamic in that it adapts to synergies over 
time. Because of the algorithm’s data-driven learn- 
ing capacity, policy-makers do not need to invest 
in identifying the precise sources of those synergies— 
local economic conditions, social environments, 
resettlement office efficacy, etc.—to harness their 
benefits. 

Our approach also preserves the ability of policy- 
makers to set their own parameters and prior- 
ities. Specifically, policy-makers can choose their 

Fig. 2. Employment gains from data-driven refugee assignment in the United States. (A) Empirical cumulative distribution functions (ECDFs)
of the refugees’ predicted 90-day employment probabilities under their actual and algorithmic assignments. (B) Actual and algorithmic employment
rates by resettlement location.

Fig. 3. Employment gains from data-driven refugee assignment in Switzerland. (A) ECDFs of the refugees’ predicted third-year
employment probabilities under their actual and algorithmic assignments. (B) Actual and algorithmic employment rates by canton.
See table S3 for canton names.

Invertebrates rely on Dicer to cleave viral double-stranded RNA (dsRNA), and Drosophila Dicer-2
distinguishes dsRNA substrates by their termini. Blunt termini promote processive cleavage, while 3’
overhanging termini are cleaved distributively. To understand this discrimination, we used cryo-electron
microscopy to solve structures of Drosophila Dicer-2 alone and in complex with blunt dsRNA. While the 
Platform-PAZ domains have been considered the only Dicer domains that bind dsRNA termini, 
unexpectedly, we found that the helicase domain is required for binding blunt, but not 3’ overhanging, 
termini. We further showed that blunt dsRNA is locally unwound and threaded through the helicase 
domain in an ATP-dependent manner. Our studies reveal a previously unrecognized mechanism for 
optimizing antiviral defense and set the stage for discovery of helicase-dependent functions in other
Dicers.


Dicer ribonucleases cleave double-stranded RNA (dsRNA) 
precursors to generate microRNAs (miRNAs) and small in-
terfering RNAs (siRNAs) (1, 2). In concert with Argonautes,
these small RNAs bind complementary mRNAs to down-reg- 
ulate their expression. miRNAs are processed by Dicer from 
small hairpins, while siRNAs are typically processed from 
longer dsRNA, from endogenous sources (3), or exogenous 
sources such as viral replication intermediates (4-6). Some 
organisms, such as Homo sapiens and Caenorhabditis ele- 
gans, encode one Dicer that generates miRNAs and siRNAs, 
but other organisms have multiple Dicers with specialized 
functions. 

Dicers exist throughout eukaryotes, and a subset have an 
N-terminal helicase domain of the RIG-I-like receptor (RLR) 
subgroup (7) (Fig. 1A and fig. S1A). RLRs often function in
innate immunity (8), and Dicer helicase domains sometimes 
show differences in activity that correlate with roles in im- 
munity. For example, Drosophila melanogaster expresses two 
Dicers, one specialized for processing miRNAs (dmDcr-1), 
and a second for siRNAs (dmDcr-2) (9). dmDcr-1 has a degen- 
erate helicase domain and is an ATP-independent enzyme 
(10), while dmDcr-2, with dedicated antiviral roles (11-13), 
has a conserved helicase domain that hydrolyzes ATP (14-17). 
Under certain conditions Homo sapiens Dicer-1 (hsDcr-1) also 
generates viral siRNAs (18, 19). However, despite conserva-
tion of its helicase domain, hsDcr-1 does not hydrolyze ATP
in vitro (20), and its helicase domain is not implicated in viral 
siRNA biogenesis in vivo (19). Differences in activities of the 
helicase domain of vertebrate and invertebrate Dicers may 
reflect distinct roles in antiviral defense. 

dmDcr-2 activity depends on termini of its dsRNA sub-
strates (15, 16). Blunt (BLT) termini promote a processive re-
action whereby multiple siRNAs are produced before dmDcr-2
dissociates, and this reaction requires a functional helicase
domain and ATP (5, 17). In contrast, dsRNAs with 3' over- 
hanging (3' ovr) termini promote an ATP-independent, dis- 
tributive cleavage, whereby dmDcr-2 dissociates after each 
cleavage. hsDcr-1 does not require ATP for processing BLT or 
3' ovr termini (fig. S1B, C and D, lanes 5, 7, 10, 12), suggesting 
that, at least in vitro, cleavage of BLT dsRNA is not dependent 
on its helicase domain. 

To understand the mechanism of termini discrimination 
by dmDcr-2, we used cryo-electron microscopy (cryo-EM) to 
determine structures of dmDcr-2 alone and in complex with 
a BLT 52 base pair (bp) dsRNA (52 dsRNA) and ATP-γS (Fig.
1, figs. S2 to S10, and table S1). We used full-length dmDcr-2
with a point mutation in each RNase III domain to preclude
dsRNA cleavage (dmDcr-2^RIII) (Fig. 1A). ATP hydrolysis is re-
quired for processive cleavage of BLT dsRNA, and dmDcr-2
cannot hydrolyze ATP-γS, which stabilizes a helicase-depend-
ent conformation of dmDcr-2 (16).

The structure of apo-dmDcr-2^RIII (Fig. 1, B and C, and figs.
S2 to S5) reiterated the “L shape” of lower resolution (~15-30
Å) EM reconstructions of hsDcr-1 (21-23). Our 7.1 Å EM den-
sity map (Fig. 1B and fig. S2C) enabled fitting (fig. S4, B to F)
and homology modeling (Fig. 1B) of Platform-PAZ domains
at the Cap and tandem RNase III domains in the Core. An
additional round of 3D classification (fig. S3) revealed an 8.7
Å map (fig. S2C) allowing assignment of the helicase domain
at the base (Fig. 1C and fig. S4A). Fitting of related apo-hel-
icases into the EM density is consistent with the helicase do-
main adopting an open conformation (fig. S5, A to D).

The 2D class averages of the dmDcr-2 complex revealed 
protein with well-resolved secondary structure features 
bound to the BLT dsRNA terminus (fig. S6, B and C). Some
protein density was missing, and since control experiments 
indicated protein on the grid was intact (fig. S7A), this was
likely due to inherent flexibility (fig. S7B). Measurements of 
the dsRNA, guided by major grooves, showed visible protein 
footprinted ~8-9 bps (fig. S6B). The crystal structure of RIG- 
I’s helicase domain bound to dsRNA has a similar footprint 
(24, 25), suggesting dmDcr-2’s helicase domain bound to BLT 
dsRNA termini. Indeed, our 6.8 Å reconstruction of the com-
plex (figs. S6E and S8) resembled RIG-I in a closed confor-
mation (Fig. 1D and fig. S9). The Hel1 and Hel2 subdomains
of RIG-I’s helicase, along with the pincer helices (fig. S1A)
could be fit as a single rigid body (fig. S9A). The reconstruc- 
tion also revealed a helical bundle characteristic of the Hel2i 
subdomain that could be fitted separately as a rigid body (fig. 
S9B). These fittings enabled a homology model of dmDcr-2’s 
helicase domain bound to BLT dsRNA (Fig. 1D and fig. S9, D 
and E). 

Our models of dmDcr-2’s helicase in open (apo) and 
closed (substrate-bound) conformations implied clamping of 
the helicase on BLT dsRNA termini (Fig. 1E and movies S1
and S2). In the open conformation, Hel2 and Hel2i extend
away from Hel1 creating a C-shaped opening for substrate en-
gagement (Fig. 1C). In the BLT dsRNA-bound state, Hel2 and
Hel2i swivel toward Hel1 to clamp on the terminus (Fig. 1, D
and E, and movies S1 and S2).

Within the helicase domain, density was only observed for 
one RNA strand, indicative of local unwinding (Fig. 1F and 
fig. S9C). Unwinding would likely require ATP hydrolysis, 
and possibly was enabled by contaminating ATP in commer-
cial preparations of ATP-γS (16). Using dsRNAs with a nick
in sense or antisense strands (fig. S11A), we performed in
vitro unwinding assays (fig. S11B). With ATP, dmDcr-2^RIII, but
not the ATPase-defective Walker A mutant dmDcr-2^RIII,K34A
(16), unwound BLT dsRNA termini (fig. S11B; compare lanes
3, 7, top and bottom panels).

The unwound single strand maintained an A-form confor- 
mation (Fig. 1F), likely to minimize entropic costs of rean- 
nealing before cleavage in RNase III sites. Whether the RIG- 
I helicase unwinds dsRNA is controversial (26, 27), but re- 
lated helicases exhibit unwinding activity (28). Local unwind- 
ing may facilitate dmDcr-2’s helicase domain in binding and 
translocating along dsRNA. 

The Platform-PAZ domains have been considered the only 
Dicer domains that bind dsRNA termini (29-32), but our 
structures suggested the helicase domain also binds termini. 
To test this idea, we purified dmDcr-2^ΔHel,RIII (Fig. 2A), which
lacked the helicase domain (Fig. 1A). Consistent with previous
studies (16), in gel-shift assays full-length dmDcr-2^RIII
bound both BLT and 3' ovr dsRNA (Fig. 2B, dsRNA design;
and Fig. 2C, top); ATP increased affinity for BLT dsRNA and
decreased affinity for 3' ovr dsRNA (Fig. 2, C and D, and table
S2). However, while dmDcr-2^ΔHel,RIII bound 3' ovr dsRNA with
an affinity similar to dmDcr-2^RIII, its binding to BLT dsRNA
was not detected (Fig. 2C, bottom panel, Fig. 2D, and table
S2). The inability of dmDcr-2^ΔHel,RIII to bind BLT dsRNA was
not due to the absence of ATP hydrolysis because dmDcr-2^RIII,K34A
bound BLT dsRNA efficiently (fig. S12, A and B, and
table S2). Thus, the helicase domain is required for binding 
BLT, but not 3' ovr, dsRNA. 

Single-turnover cleavage assays showed that neither 
dmDcr-2^WT nor dmDcr-2^ΔHel (fig. S12C) cleaved BLT dsRNA
without ATP (Fig. 2, E and F, lanes 5 and 6). With ATP, cleavage
of BLT dsRNA by dmDcr-2^WT gave heterogeneous cleavage
products characteristic of Dicer enzymes with ATPase
activity (16, 33), but strikingly, dmDcr-2^ΔHel was incapable of
cleaving BLT dsRNA (Fig. 2, E and F; compare lanes 9 and
10). As expected (15, 16), cleavage of 3' ovr dsRNA was independent
of ATP, and with both dmDcr-2^WT and dmDcr-2^ΔHel
produced a single siRNA-sized 22 nt band (Fig. 2E; compare
lanes 7, 8, 11, 12). dmDcr-2^ΔHel cleaved 3' ovr dsRNA more efficiently
than dmDcr-2^WT (Fig. 2, E and F, compare lanes 7, 8,
11, 12), suggesting the helicase domain hinders cleavage of 
3' ovr dsRNA. This observation is reminiscent of autoinhibi- 
tion by hsDcr-1’s helicase domain in processing 3' ovr dsR- 
NAs (34). 

Our biochemical and structural studies indicated that 
dmDcr-2 has two modes of substrate recognition and cleav- 
age: one mediated by Platform-PAZ domains and resulting in 
precise cleavage of 3' ovr dsRNAs into 22mer siRNAs, and a
second mediated by the helicase domain and resulting in het- 
erogeneous cleavage of BLT dsRNAs. We searched for amino 
acids that might separately affect cleavage of a 3' ovr or BLT 
dsRNA. We created one variant of dmDcr-2 (Fig. 1A, PP) containing
five point mutations in the Platform and PAZ domains
(35). Multiple crystal structures show that a phenylalanine
in the C-terminal domain (CTD) of RIG-I recognizes BLT 
dsRNA by stacking on the terminal base pair (24). Dicer en- 
zymes do not have a CTD, but for the second variant we 
searched for regions in dmDcr-2 with sequence similarity to 
the CTD. Within the region identified (fig. S13, A to C) we 
mutated a single phenylalanine to a glycine (dmDcr-2^F225G).

We compared activities of purified dmDcr-2^PP and dmDcr-2^F225G
(fig. S13D) to dmDcr-2^WT using single-turnover cleavage
assays (Fig. 3, A and B). As expected, cleavage of BLT dsRNA
was not observed without ATP (Fig. 3, A and B, lanes 5 to 7).
However, with ATP, cleavage of BLT dsRNA by dmDcr-2^WT or
dmDcr-2^PP appeared nearly identical (Fig. 3, A and B, lanes 11
and 13), while cleavage was completely disrupted by the helicase
point mutation in dmDcr-2^F225G (Fig. 3, A and B, lane 12).
[At least part of this effect is due to weakened BLT dsRNA
binding (fig. S13, E and F, and table S2).] By contrast, cleavage
of 3' ovr dsRNA was independent of ATP and minimally affected
by the F225G helicase mutation (Fig. 3, A and B, lanes
8, 9, 14, 15). However, cleavage was eliminated by mutations
in the Platform-PAZ domains (Fig. 3, A and B, lanes 10 and
16). These data reiterate that cleavage of 3' ovr dsRNA is mediated
by Platform-PAZ domains, while the helicase domain
coordinates recognition and cleavage of BLT dsRNA. 

While dmDcr-2^PP cleaved BLT dsRNA to yield a pattern
nearly identical to dmDcr-2^WT, levels of 22 nt siRNA decreased
(Fig. 3, A and B, lanes 11 and 13). Since 22 nt siRNA
was not observed with dmDcr-2^F225G (Fig. 3A, lane 12), we hypothesized
this species derived from dsRNA that was
threaded through the helicase domain until the BLT terminus 
encountered the Platform-PAZ domains. To confirm that 
smaller products (<22 nt) did not result from degradation of 
22 nt siRNA, we monitored cleavage of chimeric dsRNAs con- 
taining deoxynucleotides at positions 21-23 from the 5' ter- 
minus (Fig. 3C). Cleavage of 3'ovr dsRNA was eliminated 
with chimeric molecules (Fig. 3, D and E; compare lanes 8, 9, 
15, 16), as expected for Platform-PAZ-mediated cleavage. 
However, for BLT dsRNA, while 22 nt siRNA was absent, all 
other fragments were visible (Fig. 3, D and E; compare lanes 
6 and 13), consistent with a helicase-mediated threading 
mechanism. 

Studies of Dicer from other organisms indicate Platform- 
PAZ domains bind termini of 3' ovr dsRNA to allow measur- 
ing to RNase III active sites and production of an siRNA 
length (29, 30). Our cryo-EM structure of the dmDcr-2 com- 
plex and subsequent biochemical studies suggested that BLT 
dsRNA is cleaved differently, and in an ATP-dependent man- 
ner, threaded through the helicase domain to encounter the 
RNase III active sites. We tested the threading model by de- 
signing dsRNA with blocks at specific positions (Fig. 4A). 
Measurements using our EM density maps predicted that 
BLT dsRNAs are threaded through the helicase domain ~20 
bp before encountering RNase III domains. To trap threading 
intermediates, we put biotin-dT analogs on both strands of 
52 BLT dsRNA, at positions 28 or 37, counting from the 5’ 
end of the sense strand (Fig. 4A); there was no significant 
difference in cleavage of these modified dsRNAs (fig. S14, A
and B). However, we hypothesized that addition of streptavidin
to biotin-dT-substituted dsRNAs (Block dsRNAs) would
arrest threading of dsRNAs through the helicase. When 
dsRNA was incubated with streptavidin before initiating 
cleavage with dmDcr-2^WT and ATP, we trapped early (<11 nt,
Block-28) and intermediate (11-20 nt, Block-37) threading 
products, without observing 22 nt siRNAs (Fig. 4, B and C; 
see fig. S14C for schematic). By contrast, cleavage by hsDcr-1 
was unaffected by blocks, indicating that, at least under these 
conditions, hsDcr-1 cannot thread dsRNA through its helicase 
domain (Fig. 4D; compare lanes 12-14 to lanes 15-17). 
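
The product classes discussed above can be tallied with a short script. What follows is a minimal sketch, not the authors' quantification pipeline: it assumes band intensities have already been extracted per product length, borrows the length classes named in the Fig. 4C legend, and uses hypothetical function names and made-up example intensities. The `naive_stall_product` helper additionally assumes that a streptavidin block stalls threading roughly 20 bp ahead of the RNase III sites, which is a simplification introduced here for illustration only.

```python
from typing import Dict

SUBSTRATE_NT = 52  # length of the 52 BLT dsRNA substrate named in the text
BINS = ("<11", "11-20", "21-23", ">23")  # length classes from the Fig. 4C legend


def bin_label(length_nt: int) -> str:
    """Map a product length (nt) to one of the length classes."""
    if length_nt < 11:
        return "<11"
    if length_nt <= 20:
        return "11-20"
    if length_nt <= 23:
        return "21-23"  # siRNA-sized products
    return ">23"


def percent_by_bin(intensity_by_length: Dict[int, float]) -> Dict[str, float]:
    """Percent of total lane signal in each product class (hypothetical helper)."""
    total_signal = sum(intensity_by_length.values())
    if total_signal <= 0:
        raise ValueError("lane has no signal")
    result = {label: 0.0 for label in BINS}
    result["total"] = 0.0
    for length_nt, intensity in intensity_by_length.items():
        if length_nt >= SUBSTRATE_NT:
            continue  # uncleaved substrate does not count as a product
        pct = 100.0 * intensity / total_signal
        result[bin_label(length_nt)] += pct
        result["total"] += pct
    return result


def naive_stall_product(block_position: int, threaded_bp: int = 20) -> int:
    """Assumed product length if threading stalls ~threaded_bp before the block."""
    return block_position - threaded_bp


# Made-up intensities keyed by band length in nt (illustration only).
lane = {52: 40.0, 30: 10.0, 22: 30.0, 15: 15.0, 8: 5.0}
summary = percent_by_bin(lane)
print(summary)
print(bin_label(naive_stall_product(28)), bin_label(naive_stall_product(37)))
```

Under the stated stalling assumption, blocks at positions 28 and 37 fall into the early and intermediate classes, respectively, which is at least consistent with the trapped products described in the text.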

We anticipated that short threading intermediates (<22 
nts) might be unique to the initial cleavage event. However, 
threading intermediates were observed with dmDcr-2^WT under
multiple-turnover conditions using internally ³²P-labeled
dsRNAs, increasing proportionally with 22mers through the
reaction time course (fig. S14, D and E). Thus, at least in vitro,
threading intermediates are recurring by-products of proces- 
sive cleavage and not specific to the initial cleavage. dmDcr- 
2’s highly efficient, helicase-dependent, processive cleavage is 
likely advantageous in antiviral defense. The generation of 
heterogeneous cleavage products during processive cleavage 
is predicted to dampen the phasing signal of viral siRNAs and 
is consistent with the overlapping and discontinuous viral 
siRNAs observed in invertebrate cells (6, 13, 36). 

The dsRNA binding protein (dsRBP) Loquacious-PD 
(Loqs-PD) allows dmDcr-2 to cleave independent of termini 
(16, 37) and is required for processing endogenous siRNAs 
(38), but not for an antiviral response (13). This suggests
dmDcr-2’s intrinsic termini preferences function in viral de- 
fense, while Loqs-PD allows processing of endogenous 
dsRNA with diverse termini. By monitoring cleavage of dsR- 
NAs with 5' ovr termini, or overhangs on both strands (fig. 
S15, B to E), we determined that dsRNA with an accessible 3' 
terminus is preferentially recognized by the Platform-PAZ do- 
main, and without this feature, is processed by threading 
through the helicase domain. 

RIG-I distinguishes capped termini of cellular transcripts 
from tri- and di-phosphorylated termini of viral transcripts, 
and this is inferred to allow self versus nonself discrimination 
(8). We found that, like RIG-I, dmDcr-2 cannot efficiently process
dsRNAs capped at the 5'-terminus, although the phosphorylation
state does not affect cleavage (fig. S16, A and B,
and table S3). These results may reflect dmDcr-2’s ability to 
process precursors of both endogenous and viral siRNAs. 

We show that dmDcr-2 has two modes of cleavage (Fig. 
4E and movies S3 and S4). dmDcr-2 is capable of using its
Platform-PAZ domain to recognize 3'ovr dsRNAs in vitro, 
but it is unknown if dmDcr-2 processes such dsRNAs in vivo. 
dmDcr-2’s cognate dsRBP, R2D2, may inhibit recognition and 
processing of substrates with 3' ovr termini (77). As such, the 
Platform-PAZ domains of dmDcr-2 may function solely on 
dsRNAs that are threaded through the helicase domain. At 
least in vitro, hsDcr-1 does not distinguish termini and does 
not exhibit helicase-dependent threading. Unlike dmDcr-2, 
hsDcr-1 may rely on the Platform-PAZ domain for generating
viral siRNAs. Indeed, mutations to the Platform-PAZ do- 
mains of hsDcr-1 disrupt viral siRNA biogenesis (19). How- 
ever, given the conservation of hsDcr-1’s helicase domain, it 
is intriguing to consider that, under certain conditions, per- 
haps with additional factors, hsDcr-1 might mediate proces- 
sive cleavage by threading of dsRNA through the helicase 
domain. 

Fig. 1. Cryo-EM reconstructions of apo-dmDcr-2 and dmDcr-2•BLT dsRNA•ATP-γS. (A) dmDcr-2 domains
numbered at boundaries. Mutations/deletions in bold, and designated in text as superscripts. All mutations were in
the context of the full-length protein, unless specified by Δ. Two full-length variants had multiple mutations: PP
(H743A, R752A, R759A, R943A, R956A); RIII (D1217A, D1476A). (B) Cryo-EM density map of apo-dmDcr-2 (7.1 Å)
fitted with homology models of subdomains. RIIIa, RNase IIIa; RIIIb, RNase IIIb. (C) Homology model of apo-dmDcr-2
in open conformation, based on apo-RIG-I, and fitted as rigid body into 8.7 Å cryo-EM density map (also figs. S4A
and S5, B to D). (D) Cryo-EM reconstruction of dmDcr-2^RIII•BLT dsRNA•ATP-γS showing helicase in closed, ligand-bound
conformation. (E) Superimposition of open (light) and closed (dark) helicase conformations showing
clamping of Hel2 and Hel2i on BLT dsRNA. Arrow, direction of clamping. (F) EM density and modeling of BLT single
and double-stranded RNA.

Fig. 2. dmDcr-2’s helicase domain is required to recognize and cleave BLT dsRNA. (A) Gel-filtration and SDS-PAGE
analyses of dmDcr-2^ΔHel,RIII. (B) Cartoon of dsRNA used in (C-F), showing modifications that block binding at
one end. (C) Gel mobility shift assays of dmDcr-2^RIII (top) and dmDcr-2^ΔHel,RIII (bottom) with 52 BLT or
3' ovr dsRNA, −/+ 5 mM ATP (n ≥ 3). (D) Binding curves using data as in (C). Data points, mean ± SD (n = 3).
(E) Single-turnover cleavage assays of 52 BLT or 3' ovr dsRNA (1 nM) with dmDcr-2^WT or dmDcr-2^ΔHel (30 nM),
−/+ 5 mM ATP (n = 3). Only initial cleavage is monitored, since this removes 5' ³²P. Arrow, 22 nucleotide (nt)
siRNA product. AH, alkaline hydrolysis. Left, nt lengths. (F) Quantification of cleavage assays as in (E). Percent
dsRNA cleaved (all dsRNA except uncleaved) and percent siRNAs (21-23 nt products) resulting from 1st cleavage
were quantified. Data points, mean ± SD (n = 3). *P < 0.05; **P < 0.01; ***P < 0.001.

Fig. 3. Helicase and Platform-PAZ domains differentially contribute to cleavage of BLT and 3' ovr dsRNA.
(A) Single-turnover cleavage assays of 52 BLT or 3' ovr dsRNA (1 nM) with dmDcr-2^WT, dmDcr-2^F225G and
dmDcr-2^PP (30 nM), −/+ 5 mM ATP (n = 3). Substrates were as described in Fig. 2, B and E. AH, arrow, as in
Fig. 2E. (B) Quantification of cleavage assays as in (A). Data points, mean ± SD (n = 3). *P < 0.05; **P < 0.01;
***P < 0.001; ****P < 0.0001; P > 0.05, n.s. (nonsignificant).
and (E), with additional

Fig. 4. BLT dsRNA threads through helicase domain. (A) Substrates for (B to D); some features described in
Fig. 2, B and E. Blocked dsRNAs contained biotin-dT (red B) on both strands, with 28 and 37 indicating position
from 5' end of sense strand. (B) Single-turnover cleavage assays of blocked or unblocked 52 BLT dsRNA (1 nM)
with dmDcr-2^WT (30 nM), 5 mM ATP and 80 nM streptavidin (n = 3). dsRNA was pre-incubated with streptavidin
before adding dmDcr-2^WT. Arrow, AH, as in Fig. 2E. (C) Quantification of cleavage at 75 min as in (B). dsRNA
cleaved (%) is plotted based on all products (total), those >23 nt, siRNAs (21-23 nt products), those 11-20 nts,
and those <11 nts. Data points, mean ± SD (n = 3). *P < 0.05; **P < 0.01; ***P < 0.001; ****P < 0.0001;
P > 0.05, n.s. (nonsignificant). (D) Single-turnover cleavage assays as in (B) with dmDcr-2^WT or hsDcr-1^WT,
−/+ 80 nM streptavidin (n = 3). Black arrow, siRNA product (22 nt) with dmDcr-2^WT; green arrow, siRNA
product (26 nt) with hsDcr-1^WT. (E) Model for recognition and cleavage of BLT and 3' ovr dsRNA by dmDcr-2.
Dotted yellow arrow, clamping of helicase on BLT dsRNA; dotted white arrow, unwinding; dotted gray box,
dmDcr-2^RIII•BLT dsRNA•ATP-γS complex shown in Fig. 1D (also see fig. S10); red arrow, cleavage; gray arrow,
threading intermediates (see text and movie S3 for details). Model for 3' ovr recognition from data reported
here and elsewhere (also see movie S4) (23, 34).

Phosphoethanolamine cellulose: A naturally produced chemically modified cellulose

Wiriya Thongsomboon, Diego O. Serra, Alexandra Possling, Chris Hadjineophytou, Regine Hengge, Lynette Cegelski


Cellulose is a major contributor to the chemical and mechanical properties of plants and 
assumes structural roles in bacterial communities termed biofilms. We find that Escherichia 
coli produces chemically modified cellulose that is required for extracellular matrix assembly 
and biofilm architecture. Solid-state nuclear magnetic resonance spectroscopy of the intact 
and insoluble material elucidates the zwitterionic phosphoethanolamine modification that had 
evaded detection by conventional methods. Installation of the phosphoethanolamine group 
requires BcsG, a proposed phosphoethanolamine transferase, with biofilm-promoting cyclic 
diguanylate monophosphate input through a BcsE-BcsF-BcsG transmembrane signaling 
pathway. The bcsEFG operon is present in many bacteria, including Salmonella species, that 
also produce the modified cellulose. The discovery of phosphoethanolamine cellulose and 
the genetic and molecular basis for its production offers opportunities to modulate its 
production in bacteria and inspires efforts to biosynthetically engineer alternatively modified
cellulosic materials.


Cellulose is the most abundant biopolymer
on Earth. Plants rely on the tensile strength
and mechanical properties of cellulose to
stand upright (1). Chemically, cellulose is
a linear polysaccharide composed of β-1,4–linked
glucosyl residues. Individual strands participate
in strong hydrogen-bonding networks with
neighboring strands and contribute to the physical 
and chemical integrity of plant cell walls and 
cellulosic materials (2). Microorganisms are also 
major producers of cellulose (3). The essential 
genetic and protein machinery for cellulose pro- 
duction in bacteria include the cellulose synthase 
genes, termed bcsA and bcsB, which encode 
cellulose synthase subunits BcsA and BcsB (4). 
BcsA is an integral membrane protein containing
the catalytic active site. BcsB interacts with BcsA
at the periplasmic face of the inner membrane 
in Gram-negative bacteria, with the two subunits 
forming a channel for cosynthetic secretion of 
cellulose. Cellulose biosynthesis requires activa- 
tion by the ubiquitous bacterial second messenger 
cyclic diguanylate monophosphate (c-di-GMP) (5), 
which directly binds to BcsA (6). Intense curiosity 
has emerged in understanding the diversity of 
additional genes in cellulose biosynthesis operons 
that are present in many microorganisms (3). Here 
we report on the determination of the structure 
of a modified cellulose, phosphoethanolamine 
(pEtN) cellulose, produced naturally by Escherichia 
coli and other Gram-negative bacteria. We provide
the genetic basis for its production and the 
functional implications of gene-directed pEtN 
cellulose synthesis. 

E. coli and Salmonella are among the best- 
studied microorganisms reported to produce cellu- 
lose. These include human pathogens such as 
uropathogenic and enterohemorrhagic E. coli. 
Functionally, the exopolysaccharide cellulose is a 
major component of the self-produced extracellular 
matrix in biofilms, which represent physiologi- 
cally heterogeneous and spatially structured bacte- 
rial communities (7, 8). Biofilm formation is of 
high medical relevance, as it confers enhanced 
resistance to antibiotics and host defenses during 
infection (9). Within the biofilm matrix, cellulose 
forms a nanocomposite with amyloid curli fibers 
that encapsulates individual cells in supramolecular 
basketlike structures, enmeshes the bacterial com- 
munity, and confers cohesion and elasticity that 
allow biofilms to fold and buckle up in a tissuelike 
manner (10-12). Biochemical and solid-state nuclear 
magnetic resonance (NMR) measurements with 
the clinically important uropathogenic E. coli 
strain UTI89 established that the matrix was 
composed of curli fibers and cellulosic material 
in a 6:1 ratio by mass. During this bottom-up 
analysis involving ¹³C and ¹⁵N NMR analysis of
the purified components, we also discovered that 
the cellulose portion appears to be modified in 
some way with an aminoethyl functionality (77). 

Solid-state NMR analysis of the intact cellu- 
losic material, complemented by solution-state 
NMR and mass spectrometry analysis of acid- 
digested material, has now enabled the determi- 
nation of the chemical structure of the modified 
cellulose as a polymer containing glucose and 
glucose-6-phosphoethanolamine (Fig. 1A). The ¹³C
cross-polarization/magic angle spinning (CPMAS)
NMR spectrum of isolated cellulosic material
contains carbon contributions from the glucose 
backbone plus two additional carbons associated 
with the modification (Fig. 1B). A comparison of 
this spectrum with that of the cellulosic material 
isolated with the aid of the dye Congo red (CR) is 
provided in Fig. 1C, as the extraction and yield of 
cellulosic material is enhanced in the presence of 
CR and is sometimes used for the preparation 
of larger samples in subsequent analysis (13). CR 
is also commonly used as a supplement in nutrient 
agar plates for evaluation of E. coli and Salmonella 
community phenotypes because both curli and 
cellulosic polymers bind the dye (14). The use of
CR does not result in any changes to the cellulosic
carbon composition (Fig. 1C). ¹³C{¹⁵N} rotational-echo
double-resonance (REDOR) NMR was used to
experimentally select directly bonded carbon-nitrogen
pairs and identified the 41-parts per
million (ppm) carbon peak as the C-8 carbon, the
only carbon directly bonded to nitrogen (fig. S1).
³¹P CPMAS revealed the presence of ³¹P as a
phosphate in the polymer (fig. S2), and we hypothesized
that the attachment site was at the C-6
position because of the downfield shift of a carbon
in the C-6 region (as in glucose-6-phosphate).
¹³C{³¹P} REDOR NMR revealed that phosphorus
was indeed positioned closest to the C-6′ and C-7
carbons and next closest to the C-5 and C-8 carbons
(Fig. 1D), suggesting the full structural assignment
as pEtN cellulose. A ¹³C CP-array NMR
experiment enabled the quantitative accounting
of carbon contributions to the pEtN ¹³C CPMAS
spectrum and determined that approximately
one-half of the cellulose glucose units in the
intact polymer are modified (fig. S3).
Additional solution-based analyses were per- 
formed to complement the solid-state NMR analy- 
sis, although this required acid digestion of the 
pEtN cellulose to release soluble components into 
solution. Solution-state NMR analysis of acid- 
digested pEtN cellulose supported the assignments 
from solid-state NMR and confirmed that the 
modification occurs at the C-6 position. However, 
a standard 48-hour HCl hydrolysis leads to degra- 
dation of the modification, and only soluble glu- 
cose, glucose-6-phosphate, and ethanolamine were 
detected (figs. S4 to S7), observations that explain
the difficulty of identifying the modified cellu- 
lose using digestions and solution-based methods. 
To attempt to capture an intact modified glucose 
unit, shorter hydrolysis times were used and liquid 
chromatography-mass spectrometry was used for 
analysis. The intact pEtN glucose was detected, 
as well as released glucose, glucose-6-phosphate, 
and ethanolamine (fig. S8). pEtN
glucose likely escaped detection in previous research
because of the instability of the modification upon 
acid digestion and isolation protocols designed 
to specifically detect cellulose. Typical approaches 
do not use the careful purification protocol we 
developed and, instead, rely on cell lysis and 
polysaccharide enrichment followed by harsh 
hydrolysis and either colorimetric detection upon 
reaction with a sulfuric acid solution of anthrone 
(15) or chemical derivatization and mass spectrometry
detection (16) to support the presence
of significant glucose content as a reporter for
cellulose production. The presence of glucose-6- 
phosphate, ethanolamine, and other compounds 
could escape detection or be ascribed to residual 
cellular components. 

A biosynthetically modified cellulose has wide- 
ranging implications and potential applications. 
Among these, the specifically modified cellulose 
could be essential for the formation and function 
of bacterial biofilms containing the polymer, could 
exhibit attractive properties for new cellulosic 
materials, and could potentially be introduced 
into other organisms, if gene directed. Thus, we 
sought to identify the genes involved in the in- 
stallation of the cellulose modification. The bcsEFG 
operon, which is part of the cellulose gene cluster 
in E. coli, had not been ascribed a definitive role 
in cellulose synthesis. The ¹³C CPMAS NMR
comparison of the isolated cellulose from a bcsG
deletion mutant (ΔbcsG) revealed that the bcsG
gene was indispensable for the cellulose modification
(Fig. 1E). The spectrum lacks the contributions
from the 41-ppm C-8 carbon and the
63-ppm C-7 carbon. As expected, the sugar C-6
carbon contribution appears only at the upfield
¹³C position of 63 ppm, corresponding to unmodified
glucose. Complementation of the ΔbcsG
mutant with bcsG on a plasmid restored production
of the modified cellulose (fig. S9). The prevalence
of the modification was reduced in the
in-frame nonpolar ΔbcsF mutant and more strongly
reduced or abolished in the ΔbcsE mutant, indicating
that BcsE and BcsF may play accessory, and
possibly regulatory, roles in the installation of
pEtN by BcsG (fig. S10).

Similar macrocolony phenotypes of nonpolar 
ΔbcsE, ΔbcsF, and ΔbcsG mutants (fig. S11), as well
as their coexpression from a single operon in the
bcs gene cluster, also suggested functional cooperation
of these three proteins. Notably, BcsG and
BcsF are membrane inserted, whereas BcsE is a
soluble c-di-GMP-binding protein (17). To further
elucidate the molecular basis of BcsG function
and the roles of BcsE and BcsF, we tested for
potential direct interactions between these proteins
as well as with the cellulose synthase subunits
BcsA and BcsB. Using a bacterial two-hybrid assay
that is based on the reconstitution of adenylate
cyclase from two separate domains fused to proteins
that potentially interact (18), we observed
strong interactions in vivo of BcsG with BcsF as
well as with BcsA (Fig. 2A). Thus, BcsG operates
in close proximity to the cellulose synthesizing
BcsA-BcsB complex. Moreover, BcsF also showed
strong interaction with BcsE (Fig. 2A), suggesting
a BcsE-BcsF-BcsG pathway that controls cellulose
modification through direct protein-protein
interactions. Qualitative evaluation of cellulose
production by CR binding in E. coli strain
AR3110ΔcsgBA, which produces no curli fibers
but cellulose only, and its ΔbcsE, ΔbcsF, and ΔbcsG
mutant derivatives additionally revealed comparable
CR binding for AR3110ΔcsgBA and
AR3110ΔcsgBAΔbcsG, but reduced CR binding
for AR3110ΔcsgBAΔbcsE and AR3110ΔcsgBAΔbcsF
(fig. S12). The decrease in cellulose production by
the ΔbcsE and ΔbcsF mutants was corroborated
by a low yield in isolatable material for NMR analysis
and could indicate that BcsE and BcsF contribute
to stability of the BcsA-BcsB machinery in
AR3110, influencing efficiency of cellulose synthesis. 

BcsG is composed of 559 amino acids and has
been predicted to be an integral membrane protein. 
A hydropathy plot analysis of BcsG supported 
the presence of several putative transmembrane- 
spanning regions in the N-terminal 160 amino 
acids followed by a large hydrophilic C-terminal 
domain. However, charge distribution flanking 
the hydrophobic amino acid stretches did not 
allow us to unequivocally predict the number 
and orientation of the transmembrane regions 
(19) and thus the localization of the C-terminal 
domain. Therefore, we generated a set of translational
reporter fusions to β-galactosidase (LacZ)
and alkaline phosphatase (PhoA) to identify cytoplasmic
and periplasmic regions of BcsG. This
approach is based on the observation that the signal
sequence-driven attempt to export the normally
cytoplasmic LacZ results in the jamming of secretion
machinery and interferes with LacZ activity,
whereas PhoA becomes active only upon
its secretion to the periplasm, where DsbAB-mediated
disulfide bond formation occurs (20).
Accordingly, fusions inserted after codon 1 of bcsG
showed high LacZ activity, but low PhoA activity
(Fig. 2B). By contrast, when the entire N-terminal
BcsG domain of approximately 160 amino acids—
including all hydrophobic stretches—was present 
in the hybrid proteins, low LacZ and high PhoA 
activities were observed, consistent with the large 
C-terminal domain of BcsG residing on the peri- 
plasmic side of the inner membrane (Fig. 2, B 
and C). A similar analysis with BcsF, a 63-amino
acid peptide, showed that its N-terminal single- 
transmembrane region, which is preceded by 
negative charges and followed by positive charges, 
inserts into the membrane such that the small 
hydrophilic C-terminal domain remains on the 
cytoplasmic side of the membrane (Fig. 2, B and C). 
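
The two steps in this topology analysis, a hydropathy scan followed by reporter-fusion readout, can be sketched in code. This is an illustrative sketch under stated assumptions rather than the procedure actually used: the text does not name a hydropathy scale, so the widely used Kyte-Doolittle values are assumed here, and the window size, activity cutoffs, function names, and example numbers are all hypothetical.

```python
from typing import List

# Kyte-Doolittle hydropathy values (assumed scale; not specified in the text).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 19) -> List[float]:
    """Sliding-window mean hydropathy; high stretches suggest membrane spans."""
    seq = seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KD[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def classify_fusion(lacz: float, phoa: float,
                    lacz_cutoff: float, phoa_cutoff: float) -> str:
    """Infer where a fusion joint sits from paired reporter activities.

    High LacZ with low PhoA points to the cytoplasm; low LacZ with high
    PhoA points to the periplasm. The cutoffs are hypothetical and would
    have to come from controls.
    """
    lacz_high = lacz >= lacz_cutoff
    phoa_high = phoa >= phoa_cutoff
    if lacz_high and not phoa_high:
        return "cytoplasmic"
    if phoa_high and not lacz_high:
        return "periplasmic"
    return "ambiguous"


# Toy inputs for illustration only; not measured values.
profile = hydropathy_profile("LLLLAR", window=2)
early_joint = classify_fusion(lacz=900.0, phoa=5.0, lacz_cutoff=100.0, phoa_cutoff=50.0)
late_joint = classify_fusion(lacz=10.0, phoa=400.0, lacz_cutoff=100.0, phoa_cutoff=50.0)
print(profile, early_joint, late_joint)
```

Returning "ambiguous" when both reporters are high or both are low keeps the sketch from forcing a call that the paired-reporter logic in the text does not support.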

We addressed the question of the substrate for 
BcsG-mediated pEtN modification of cellulose
emerging from the BcsA-BcsB complex. We noticed
that, with respect to overall size (559 and 563 

Fig. 1. E. coli produces phosphoethanolamine cellulose. (A) Representation of the chemical
structure of glucose and pEtN glucose units in pEtN cellulose. (B) ¹³C CPMAS solid-state NMR
spectra of the pure modified cellulose with two additional carbon contributions, C-7 (63 ppm)
and C-8 (41 ppm). The C-6 carbon appears at 62 ppm for the unmodified glucose units and at
66 ppm for the modified glucose units (figs. S1 and S4). (C) ¹³C CPMAS spectra of the modified
cellulose compared with that of the modified cellulose isolated from cells grown in the presence of CR.
The pure CR spectrum is provided as an overlay (dashed red line). The comparison demonstrates
that purification with CR does not influence the polysaccharide composition. δC, carbon chemical shift.
(D) The C-6′ and C-7 carbon chemical-shift region exhibited the strongest dephasing in the 1-ms ¹³C{³¹P}
REDOR NMR measurement, followed by that of the C-5′ and C-8 carbons, suggesting the full structural
assignment as pEtN cellulose, further confirmed by solution-state NMR and mass spectrometry
(figs. S4 to S8). S₀, REDOR full-echo spectrum; ΔS, REDOR difference spectrum. (E) The ¹³C CPMAS
spectrum of the cellulosic material isolated from the bcsG derivative lacked modification carbons
and contained only the ¹³C chemical shifts expected for standard amorphous cellulose.

Fig. 2. BcsG directly interacts with cellulose synthase and communicates with the c-di-GMP–binding
BcsE via the transmembrane peptide BcsF. (A) Interactions of the indicated proteins were tested
using a bacterial two-hybrid (2H) system based on the reconstitution of adenylate cyclase (AC) (18), which
allows the utilization of maltose by W3110Δcya, resulting in red color on MacConkey agar plate. The 2H
vector plasmids allow the attachment of the respective AC domain tags (18, 25), either at the N terminus
(pKT25, pUT18c) or the C terminus (pKNT25, pUT18) of a protein. For BcsA, BcsB, BcsE, and BcsF, the tags
were located at the C terminus; for BcsG, the tags were located at the N terminus. Zip-zip, leucine zipper
domain of the yeast GCN4 protein, used as a positive control. (B) Transmembrane orientation of BcsF and
BcsG was determined by assaying enzymatic activities of hybrid proteins between N-terminal parts from
BcsF and BcsG fused to LacZ and PhoA expressed from low–copy number plasmids in strains W3110Δlac(I-A)
and W3110ΔphoA. Fusion joints were after codon 1 (all combinations), codon 24 (of bcsF fused to both
reporter genes), codon 162 (bcsG::lacZ), or codon 158 (bcsG::phoA). (C) A schematic model of the directly
interacting modules for cellulose synthesis (BcsAB) and modification (BcsEFG) summarizes the protein-protein
interactions (double-headed arrows) detected in (A), the transmembrane orientation of BcsG and
BcsF as tested in (B), and dual control by the second messenger c-di-GMP, which binds to both BcsA (6) and
BcsE (17). NTD, N-terminal domain; CTD, C-terminal domain; UDP-Glc, uridine diphosphate-glucose.

Fig. 3. Phosphoethanolamine cellulose production is detected in curli-integrated E. coli biofilm
matrices with isotopic serine labeling and is also produced by Salmonella enterica. (A) Isotopic
labeling with L-[3-¹³C]Ser-supplemented YESCA nutrient medium resulted in enrichment of the
pEtN cellulose C-7 carbon in an isolated pEtN sample, consistent with routing through a possible
substrate such as phosphatidylethanolamine. (B) Isotopic labeling with L-[¹⁵N]Ser was evaluated
by ¹⁵N CPMAS NMR on extracellular matrix samples containing both curli and cellulosic material. The
¹⁵N-amide signals correspond to curli amides. The loss of the ¹⁵N-amine signal in the bcsG derivative
(right) confirmed the amine nitrogen assignment as that from pEtN cellulose. Loss of the modification
was accompanied by loss of the wrinkled macrocolony morphology (inset photographs). (C) The
¹³C CPMAS spectrum of the cellulosic material isolated from Salmonella enterica serovar Typhimurium
strain IR715ΔcsgBA matched that of pEtN cellulose from AR3110ΔcsgBA.
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residues, respectively) and length and transmem- 
brane orientation of domains, BcsG resembles 
EptB, a PEtN transferase that uses the phospho- 
lipid phosphatidylethanolamine (PE) to modify 
bacterial lipopolysaccharide (27). Two similar 
PEtN transferases, EptC from Campylobacter 
jejuni and LptA from Neisseria meningitidis, also 
have their active site domains on the periplasmic 
side of the cytoplasmic membrane; they equally 
use PE as a substrate, and their crystal structures 
show two or three histidine residues coordinat- 
ing zinc, which is essential for activity (22). We 
therefore hypothesized that histidine residues in 
BcsG may play a similar role and that pEtN modi- 
fication by BcsG may also originate from PE. We 
isolated a mutant version of BcsG (BcsG™™) in 
which His®®°, His*°°, and His** in the periplas- 
mic domain were replaced by alanine residues. 
BcsG™ did not complement the ΔbcsG mutation 
with respect to macrocolony formation (fig. S13) 
and was inactive with respect to cellulose modi- 
fication (fig. S14), suggesting that BcsG enzymatic 
activity resides in the periplasm and is related to 
these known PEtN transferases, despite the absence 
of clear primary-sequence similarity. If BcsG also 
uses PE as a substrate, the modified cellulose 
should have atoms derived from serine, which 
serves as a direct precursor for the ethanolamine 
moiety of PE. Thus, pEtN cellulose was prepared 
from cells grown on agar medium supplemented 
with 25 mg/liter of L-[3-¹³C]Ser to detect whether 
PEtN cellulose would be enriched through incor- 
poration of the serine label. The expected C-7 carbon 
in the pEtN cellulose spectrum was indeed en- 
hanced as a result of label incorporation from 
serine (Fig. 3A). The routing of serine into the modi- 
fication is consistent with PE serving as a substrate 
for BcsG. Labeling with L-[¹⁵N]Ser was also suc- 
cessful and is additionally valuable in identifying 
the pEtN cellulose modification in the context 
of complex extracellular matrix samples that 
contain both pEtN cellulose and curli. Labeling 
with L-[¹⁵N]Ser yielded the anticipated ¹⁵N-amine 
contribution from pEtN cellulose and the shift- 
resolved curli ¹⁵N-amide contributions from ser- 
ine as well as glycine residues that result from 
serine conversion in glycine biosynthesis (Fig. 3B). 
In this way, the potential presence of pEtN cel- 
lulose can be determined in intact extracellular 
matrix preparations from different E. coli strains 
and different organisms. 

Overall, we propose a model in which BcsG 
acts as a pEtN transferase with its catalytic domain 
in the periplasm, modifying cellulose after its 
emergence from the BcsA-BcsB machinery (Fig. 2C). 
Given the association of BcsG with the c-di-GMP- 
binding cytoplasmic protein BcsE and the trans- 
membrane peptide BcsF, we also propose that 
cellulose modification is controlled by c-di-GMP 
in a transmembrane signaling pathway that in- 
volves BcsE and BcsF, with BcsF serving as a 
direct link between BcsE and BcsG (Fig. 2C). 
Thus, the biofilm-promoting second messenger 
c-di-GMP plays a dual role by activating both the 
synthesis and the modification of cellulose via 
the PilZ domain of BcsA and the BcsE-BcsF-BcsG 
transmembrane signaling pathway, respectively. 
Notably, the Kd values for c-di-GMP binding to 
BcsA and BcsE are 8 (23) and 2.4 µM (17), re- 
spectively, which ensures efficient modification 
whenever c-di-GMP levels are high enough to 
support cellulose synthesis. 
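
The relationship between the two affinities can be illustrated with a rough calculation. The sketch below assumes simple one-site equilibrium binding, where occupancy is [L] / (Kd + [L]); that model is an assumption made here for illustration and is not taken from the study. It uses the Kd values quoted above (8 µM for BcsA, 2.4 µM for BcsE), and the c-di-GMP concentrations scanned are arbitrary placeholders.

```python
# One-site binding sketch: occupancy = [L] / (Kd + [L]).
# Kd values are those quoted in the text; the ligand concentrations are
# illustrative assumptions, not measurements.

KD_UM = {"BcsA": 8.0, "BcsE": 2.4}  # dissociation constants in micromolar


def occupancy(ligand_um: float, kd_um: float) -> float:
    """Fraction of binding sites occupied at a free ligand concentration."""
    if ligand_um < 0 or kd_um <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return ligand_um / (kd_um + ligand_um)


def occupancy_table(concentrations_um):
    """Return (concentration, BcsA occupancy, BcsE occupancy) rows."""
    return [
        (c, occupancy(c, KD_UM["BcsA"]), occupancy(c, KD_UM["BcsE"]))
        for c in concentrations_um
    ]


if __name__ == "__main__":
    for c, a, e in occupancy_table([1.0, 2.4, 8.0, 20.0]):
        print(f"c-di-GMP {c:5.1f} uM  BcsA {a:.2f}  BcsE {e:.2f}")
```

Under this simplified model, BcsE is more saturated than BcsA at every nonzero concentration, which is consistent with the statement that modification is engaged whenever synthesis is.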

Production of the biofilm matrix by wild-type 
E. coli involves the coproduction and tight asso- 
ciation of amyloid curli fibers and what has been 
considered to be cellulose (8, 12). Yet, we have 
now determined that E. coli strains such as strain 
AR3110, which is a direct derivative of the widely 
studied K-12 laboratory strain W3110 (12), as well 
as the classical uropathogenic UTI89 (24) pro- 
duce pEtN cellulose. Thus, we sought to test 
whether this cellulose modification is function- 
ally important for biofilm-matrix architecture and 
function. Macrocolony morphotypes, the matrix 
fine structure as analyzed by in situ fluorescence 
and electron microscopy, and the multicellular 
cohesion of macrocolonies when challenged by 
shear stress were evaluated. BcsG was required 
for the buckling into radial ridges and wrinkles 
typically observed for the otherwise very flat 
AR3110 macrocolonies (Fig. 4A, compare panels i 
and iii). The ring-shaped curli-only-driven macro- 
colony architecture of AR3110ΔbcsG (Fig. 4A, panel 
iii) resembles that of cellulose-free strains such as 
AR3110ΔbcsA (Fig. 4A, panel ii) or W3110 (12). 
Furthermore, fluorescence microscopy of vertical 
cryosections through macrocolonies grown in 
the presence of thioflavin S, acting as a matrix 
dye that binds to cellulose and curli, revealed that 
BcsG was required for cellulose to assemble into 
long, thick, and straight filaments. These were 
most clearly observed in the absence of curli, i.e., 
in AR3110ΔcsgB (Fig. 4A, panel iv). Compared to 
these long filaments, only short, thin, and curled 
filaments were detected in AR3110ΔcsgBΔbcsG 
macrocolonies (Fig. 4A, panel v, and fig. S15). 
Scanning electron microscopy further supported 
a role of modified cellulose in matrix architecture. 
The matrix at the surface of an AR3110ΔbcsG 
macrocolony resembles that of the curli-only strain 
W3110 (Fig. 4B). Thus, cellulose modification is re- 
quired to form the extended composite structure 
with curli fibers that nearly fully covers the sur- 
face of a macrocolony biofilm (as visible with 
strain AR3110 in Fig. 4B). Finally, an extracellular 
matrix consisting of either the cellulose-curli 
nanocomposite or of cellulose alone is known 
to provide cohesion to the cellular community. 
The resulting tissuelike behavior includes the 
ability not only to fold and buckle up but also 
to resist shear stress as a cohesive community 
(12). The latter can be tested by submerging a 
macrocolony in liquid and exposing it to gentle 
shaking. In this assay, the curli-free AR3110ΔcsgBA 
strain detached from the agar phase as an entire 
macrocolony, whereas colonies of the corresponding 
cellulose modification-deficient strain just dis- 
solved into flares of loose cells (fig. S16). Thus, 
pEtN modification of cellulose is required for 
community behavior based on the formation of 
the connective matrix consisting of either long- 
range cellulose fibers or the curli-cellulose nano- 
composite network that envelops and connects 
bacterial cells during biofilm formation. 

The production of pEtN cellulose may also 
have consequences for virulence of pathogenic 
strains. Amyloid curli fibers are proinflamma- 
tory (25, 26) and a virulence factor for various 
types of pathogenic E. coli (27, 28), yet the tight 
association of curli with “cellulose” was reported 
to counteract these properties (27, 29). Therefore, 
mutations in the bcsEFG operon resulting in 
nonmodified cellulose that cannot form the nano- 
composite with curli fibers may enhance the con- 
tribution of curli to virulence of pathogenic E. coli. 
Consistent with this notion, the 2011 European 
outbreak O104:H4 strain, an enteroaggregative 
and enterohemorrhagic E. coli which not only 
produced Shiga toxin but also high amounts of 
curli at 37°C while being a bcsE mutant (30), was 
of unprecedented virulence for this pathotype of 
E. coli (31). 



Fig. 4. Eliminating BcsG changes macroscopic morphology and microscopic matrix architecture 
of E. coli macrocolony biofilms. (A) Macrocolonies of strain AR3110, which produces both cellulose 
and curli fibers, and the indicated mutant derivatives (i to v) were grown for three days on salt-free 
LB agar plates containing either Congo red or the green-fluorescent thioflavin S, which stain cellulose 
and curli without affecting the overall matrix architecture and colony morphotype. The microscopic 
architecture of thioflavin S-stained matrix was visualized in thin cross sections of macrocolonies, with 
color-coded boxed areas being further enlarged adjacently. (B) The surface of macrocolonies was 
visualized at high resolution by scanning electron microscopy. The classical E. coli K-12 lab strain W3110 
is isogenic to AR3110, except for a bcsQ*? mutation, which eliminates the ability to produce cellulose 
(12). Because of polarity, the ΔcsgB mutation also eliminates the expression of CsgA (from the csgBA 
operon), i.e., both curli subunits encoded by csgBA are not produced. 

Finally, together with core cellulose genes, bcsEFG 
genes have been shown to occur in many γ- and 
β-proteobacteria (3). We isolated the cellulose 
material from Salmonella enterica serovar Typhi- 
murium strain IR715ΔcsgBA (32), a curli mutant, 
and discovered that it also produces pEtN cellu- 
lose. The ¹³C CPMAS NMR spectrum of isolated 
cellulose from Salmonella matches that of pEtN 
cellulose from E. coli (Fig. 3C). Thus, the pEtN 
modification of cellulose is likely to be common 
in the γ and β branches of proteobacteria. Nota- 
bly, however, Komagataeibacter xylinus (for- 
merly known as Acetobacter xylinum), which 
produces excessive amounts of cellulose such 
that PEtN modification would predictably lead 
to a depletion of the headgroups of the phospho- 
lipid membrane, does not possess the bcsEFG genes 
(3). Other bacterial species that do not possess 
bcsEFG genes, but feature accessory bcs genes of 
unknown function, could possibly use alternative 
modes of cellulose modification. 

Modified cellulose is produced by strains that 
have been assumed in the literature to be producing 
standard amorphous cellulose on the basis of 
simple calcofluor-white staining procedures and 
conventional isolation methods designed for the 
detection of glucose from hydrolyzed cellulose. 
However, these methods involve harsh hydroly- 
sis protocols and crude purification or enrichment 
methods, followed by chromatography and mass 
spectrometry, and, thus, a complete accounting of 
the intact material has not been attempted. Solid- 
state NMR analysis of the relevant intact poly- 
saccharide was able to identify this biologically 
important pEtN modification that evaded detec- 
tion by conventional approaches. PEtN cellulose 
is a newly identified zwitterionic polymer, and, to 
our knowledge, our study provides the first de- 
finitive evidence so far of a naturally postsyntheti- 
cally modified cellulose. In the extracellular matrix 
of bacterial biofilms, pEtN modification of cellu- 
lose seems to be multifunctional: It is required for 
the formation of long cellulose fibrils and a tight 
nanocomposite with amyloid curli fibers that 
generate tissuelike cohesive and elastic behav- 
ior of biofilms, it may confer resistance against 
attacks by cellulase-producing microorganisms 
(e.g., fungi) in the environment, and it can prevent 
amyloid curli fibers from hyperstimulating immune 
responses, which, in the long run, may contrib- 
ute to pathogen fitness in the host. Moreover, 
cellulose biosynthesis and postsynthetic modi- 
fication are coregulated by the ubiquitous biofilm- 
promoting second messenger c-di-GMP via a 
transmembrane c-di-GMP signaling pathway. 
Inhibition of BcsG could offer new opportuni- 
ties to control biofilm formation, in particular 
by Gram-negative pathogens associated with 
chronic infections. Furthermore, the identifica- 
tion of the gene-directed biosynthetic machinery 
also inspires the generation of engineered sys- 
tems to produce alternately modified cellulosic 
materials. 

Accurate chromosome segregation requires the proper assembly of kinetochore proteins. A key step in 
this process is the recognition of the histone H3 variant CENP-A in the centromeric nucleosome by the 
kinetochore protein CENP-N. We report cryo-EM, biophysical, biochemical, and cell biological studies of 
the interaction between the CENP-A nucleosome and CENP-N. We show that human CENP-N confers 
binding specificity through interactions with the L1 loop of CENP-A, stabilized by electrostatic 
interactions with the nucleosomal DNA. Mutational analyses demonstrate analogous interactions in 
Xenopus, which is further supported by residue-swapping experiments involving the L1 loop of CENP-A. 
Our results are consistent with co-evolution of CENP-N and CENP-A, and establish the structural basis for 
recognition of the CENP-A nucleosome to enable kinetochore assembly and centromeric chromatin 
organization. 


The correct transfer of genetic material from mother to 
daughter cells requires the accurate segregation of 
chromosomes during mitosis. Mis-segregation of 
chromosomes can lead to aneuploidy, a hallmark of cancer 
(1). Chromosome segregation requires the assembly of 
kinetochore proteins at the centromere, which is 
epigenetically marked by nucleosomes containing the histone 
H3 variant CENP-A (2). The centromeric nucleosome 
associates with a group of 16 inner kinetochore proteins, 
collectively known as the constitutive centromere associated 
network (CCAN) (3). Among the CCAN proteins, CENP-C and 
CENP-N can directly recognize the CENP-A nucleosome core 
region and are essential for kinetochore assembly and 
chromosome segregation (4). Two regions of CENP-C, the 
central region and the CENP-C motif, bind the CENP-A 
nucleosome by interacting with the C-terminal tail of CENP- 
A and the acidic patch on histones H2A and H2B in the 
centromeric nucleosome (5, 6). The CENP-N N-terminal 
region (residues 1-286 in human) is responsible for 
recognition of the CENP-A nucleosome (7, 8) while its C- 
terminal region (residues 287-339) interacts with the CCAN 
through CENP-L (9, 10). The recruitment of CENP-N to the 
centromere requires recognition of the L1 loop of CENP-A 
(11). 

To determine structural aspects of the binding between 
CENP-A and CENP-N, we carried out cryo-EM analyses of the 
complexes formed between residues 1-286 of human CENP-N 
(tagged at its N terminus with His₆ and maltose binding pro- 
tein (MBP); hereafter referred to as hCENP-N₁₋₂₈₆) and the 
CENP-A nucleosome containing “601” DNA (figs. S1 and S2). 
CENP-N binding to the CENP-A nucleosome is independent 
of DNA sequence (8), and we observed that CENP-A nucleo- 
somes containing either “601” DNA or human centromeric α- 
satellite DNA bound hCENP-N₁₋₂₈₆ with similar affinity and 
stoichiometry (fig. S3). When excess hCENP-N₁₋₂₈₆ was used, 
CENP-N/CENP-A nucleosome complexes with 1:1 or 2:1 stoi- 
chiometry are observed in the cryo-EM 2D class averages (fig. 
S4 and table S1, dataset 1). The complex with 2:1 stoichiome- 
try is a smaller fraction of the overall population, and the 
density observed for the second site is weaker than at the first 
site. In the absence of excess hCENP-N₁₋₂₈₆, most of the 2D 
class averages displayed only one molecule of hCENP-N₁₋₂₈₆ 
bound to the CENP-A nucleosome (fig. S5C). We refined this 
population to obtain a 3D reconstruction at an overall reso- 
lution of 3.9 Å for the complex formed between hCENP-N₁₋₂₈₆ 
and the CENP-A nucleosome (Fig. 1; figs. S5 and S6; and table 
S1, dataset 2). 

The density map shows an identifiable nucleosome and a 
bi-lobed assembly corresponding to hCENP-N₁₋₂₈₆ (Fig. 1, A to 
C). The resolution of this map is highest in the core of the 
CENP-A nucleosome and lower in the periphery and in the 
bound CENP-N (Fig. 1D and fig. S6). The map of the CENP-A 
nucleosome core is at a resolution (~3.5 Å) adequate to inter- 
pret directly in terms of an atomic model since side-chain 
densities are clearly visualized (Fig. 1D and fig. S7). The final 
structure of the CENP-A nucleosome core is similar to the 
crystal structure of the free CENP-A nucleosome (PDB 3AN2) 
(12). While the DNA near the entry and exit sites can be visu- 
alized (fig. S8) and is displaced further away from the core 
histones in comparison with the canonical nucleosome struc- 
ture (PDB 3LZ0) (13), the significance of these minor differ- 
ences with previously reported structures is currently 
unclear. 

The map of the CENP-N region is at a lower resolution 
(~4.5 Å; fig. S6, C to F). At this resolution, densities for several 
α-helical regions as well as β-strands are well-resolved, but 
since side chain densities are not delineated, it is not possible 
to thread the sequence unambiguously into the density map. 
In order to identify CENP-N residues that may form the in- 
terface with the CENP-A nucleosome, we carried out an ex- 
tensive, iterative process to predict and fit various stretches 
of secondary structure elements and subdomains into the 
density map (see Methods). These efforts resulted in a pre- 
dicted molecular model with a sequence assignment based on 
the modeled subdomains (figs. S9 and S10 and table S1). From 
this model, we identified residues likely to form the interface 
with the CENP-A nucleosome. 

In our model, residues 1-185 of hCENP-N are organized 
into distinct N-terminal and central domains (Fig. 1 and figs. 
S9 and S10). The MBP-moiety, which is attached to the N ter- 
minus of hCENP-N₁₋₂₈₆ through a di-peptide (Ala-Ala) linker, 
does not interfere with the hCENP-N/CENP-A nucleosome in- 
teraction based on binding studies (fig. S3, C to G). The ap- 
parent KD values for MBP-tagged and untagged hCENP-N₁₋₂₈₆ 
are 0.042 ± 0.007 µM and 0.046 ± 0.008 µM, respectively. 
This is consistent with our assignment of the location of MBP 
to the weak density to the side of hCENP-N₁₋₂₈₆ that is distal 
from the core (Fig. 1C and figs. S5E and S6A). There is un- 
structured density consistent with regions beyond residue 
185 of hCENP-N, but they could not be assigned presumably 
due to conformational flexibility (fig. S9, A and B). The N- 
terminal hCENP-N domain, comprising residues 1-81, is com- 
posed of five antiparallel α-helices that show structural con- 
servation to proteins of the death domain superfamily, while 
the central domain comprising residues 101-185 contains 
both α helices and β strands with similarity to regions of the 
appendage domain of β2 adaptin (fig. S9, F to H) (14, 15). 
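
As a quick check on the claim that the MBP tag does not affect binding, the two apparent KD values quoted above can be compared against their stated uncertainties. This is only a sketch: it assumes both values are in µM, that the ± terms behave like independent standard errors, and that they combine in quadrature. None of these assumptions is stated in the text, and the function name is hypothetical.

```python
import math


def difference_in_sigma(a, a_err, b, b_err):
    """|a - b| divided by the combined uncertainty sqrt(a_err^2 + b_err^2)."""
    combined = math.hypot(a_err, b_err)
    if combined == 0:
        raise ValueError("at least one uncertainty must be nonzero")
    return abs(a - b) / combined


# Apparent KD values (micromolar) and their quoted uncertainties.
tagged = (0.042, 0.007)     # MBP-tagged construct
untagged = (0.046, 0.008)   # untagged construct

z = difference_in_sigma(*tagged, *untagged)
print(f"difference = {z:.2f} combined uncertainties")
```

A difference well below one combined uncertainty, as here, is what one would expect if the tag has no measurable effect on affinity under these assumptions.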

The structure of the complex shows that hCENP-N₁₋₂₈₆ has 
direct interactions with the DNA of the nucleosome, as well 
as with the L1 loop of CENP-A (Fig. 2A and fig. S11, A and B). 
We identified several positively charged residues of hCENP- 
N proximal to nucleosomal DNA that likely form stabilizing 
interactions with the backbone phosphates of DNA (Fig. 2B). 
Consistent with our structural model, mutations of hCENP-N 
residues R11, K15 and K45 have already been shown to reduce 
its binding to the CENP-A nucleosome, and further, mutation 
of R11 disrupts centromeric localization and the recruitment 
of other CCAN proteins in human cells (8). To further verify 
the role of these and other residues identified at the interface, 
we constructed hCENP-N₁₋₂₈₆ mutants in the N-terminal do- 
main (R11A, R44A, K45A, K81A) and in the central domain 
(K148A, R169A, and R170A) and analyzed the binding by elec- 
trophoretic mobility shift analysis (EMSA); these mutations 
impair hCENP-N₁₋₂₈₆ binding to the CENP-A nucleosome (Fig. 
2C). Mutation of residues H77, H79, or K90, which are located 
on the linker connecting the N-terminal and central domains, 
also resulted in decreased binding affinity to various extents 
(fig. S12A), likely due to perturbations of CENP-N tertiary 
structure. As controls, we also tested the effects of mutating 
several positively charged residues predicted to be distal from 
the nucleosomal DNA (R60A, K109A/K110A, and K143A) or 
in residues C-terminal to the central domain (K207A, R236A, 
and R250A). All of these mutations showed minimal effects 
on hCENP-N₁₋₂₈₆ binding affinity (Fig. 2D and fig. S12A) (7). 

To examine the physiological significance of CENP-N in- 
teractions with the nucleosomal DNA, we mutated conserved 
and analogous residues within Xenopus laevis CENP-N 
(xCENP-N) (fig. S13A), and tested their effects on the centro- 
meric localization of exogenously expressed MBP-xCENP- 
N/CENP-L complex in interphase Xenopus egg extracts (fig. 
S14). Mutation of xCENP-N residues R29, K99 and K165 
(which correspond to residues R11, K81 and K148 in hCENP- 
N) led to a significant reduction in the localization of the 
xCENP-N/L complex to centromeres (Fig. 2, E and F). Muta- 
tion of K108 (K90 in hCENP-N) in the linker region caused 
minor defects in localization (fig. S14, E and F). Furthermore, 
mutation of K223 (K207 in hCENP-N), which is C-terminal to 
the central domain, did not diminish centromeric localiza- 
tion (fig. S14, E and F). These results are in agreement with 
our findings on hCENP-N (Fig. 2 and fig. S141), and suggest 
that the interactions with nucleosomal DNA identified in the 
structure of the hCENP-N₁₋₂₈₆/CENP-A nucleosome complex 
are conserved across Xenopus and human centromeres. 
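
The localization comparisons in this paragraph are reported in the figures as centromeric MBP fluorescence normalized to wild type, with SEM error bars. A minimal sketch of that kind of normalization follows, assuming per-centromere intensities are available as plain lists. The intensity values are made-up placeholders, not data from the study, and the function names are hypothetical.

```python
import math
import statistics


def percent_of_wildtype(intensities, wildtype_mean):
    """Express each intensity as a percentage of the wild-type mean."""
    if wildtype_mean <= 0:
        raise ValueError("wild-type mean must be positive")
    return [100.0 * value / wildtype_mean for value in intensities]


def mean_and_sem(values):
    """Return (mean, standard error of the mean) using the sample stdev."""
    if len(values) < 2:
        raise ValueError("need at least two values to estimate SEM")
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


# Placeholder intensities (arbitrary units), purely illustrative.
wild_type = [100.0, 120.0, 80.0]
mutant = [30.0, 50.0, 40.0]

wt_mean = statistics.fmean(wild_type)
mutant_pct = percent_of_wildtype(mutant, wt_mean)
mutant_mean, mutant_sem = mean_and_sem(mutant_pct)
print(f"mutant: {mutant_mean:.1f}% of wild type, SEM {mutant_sem:.1f}")
```

With several hundred centromeres per condition, as in the figure legends, the SEM shrinks roughly with the square root of the sample size, so small error bars do not by themselves indicate a large effect.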

In our structural model, the N-terminal helix of 
hCENP-N₁₋₂₈₆ is in contact with the L1 loop of CENP-A, in which resi- 
dues R80 and G81 constitute an insertion compared to canon- 
ical H3.1 (fig. S11B) (12). Previous studies have shown that 
hCENP-N fails to localize to the centromere upon a double 
mutation of R80A and G81A in CENP-A (11), and substitution 
of the centromere targeting domain (CATD) of CENP-A in- 
cluding the L1 loop and α2 helix to the corresponding region 
in H3 (H3^CATD) is sufficient for recruitment of CENP-N (8, 16). 
We further corroborated these findings by in vitro mutational 
analysis (fig. S12B), and by cryo-EM structure determination 
of the hCENP-N₁₋₂₈₆/H3.1^CATD chimeric nucleosome complex 
(fig. S15 and table S1), which revealed that hCENP-N₁₋₂₈₆ bind- 
ing to the chimeric nucleosome is structurally similar to its 
binding to CENP-A containing nucleosome (fig. S15E). In our 
structural model of the hCENP-N₁₋₂₈₆/CENP-A nucleosome 
complex, conserved residues E3 and E7 from helix 1 of 
hCENP-N are proximal to R80 of CENP-A, and residue T4 of 
hCENP-N is close to V82 of CENP-A (Fig. 3, A and B, and fig. 
S11). The structure thus suggests that both electrostatic and 
hydrophobic interactions are involved at the interface. Gel 
shift assays show that the E3A/E7A and T4A mutants of 
hCENP-N₁₋₂₈₆ show reduced binding to the CENP-A nucleo- 
some, while the E3K, E7K, E3K/E7K and E3A/T4A/E7A mu- 
tants show aberrant binding (Fig. 3C and fig. S12C). Further, 
the E3K/E7K and E3A/T4A/E7A mutants exhibit loss of spec- 
ificity for CENP-A over H3.1 nucleosomes (fig. S12, D and E). 
Together, these results are consistent with a model in which 
a small number of residues of hCENP-N that interact with the 
L1 loop of CENP-A provide interaction specificity, while 
broad electrostatic interactions with the backbone phos- 
phates of DNA enhance binding affinity. 

To further evaluate the physiological significance of 
CENP-N interactions with the L1 loop of CENP-A, we exam- 
ined the effects of the xCENP-N double mutants E21A/E25A 
and E21K/E25K (which correspond to mutations E3A/E7A 
and E3K/E7K in hCENP-N) on centromeric localization in 
Xenopus egg extracts (Fig. 3, D and E, and fig. S13A). Com- 
plete loss of centromeric localization was observed in the 
E21K/E25K mutant, but not in the E21A/E25A mutant (Fig. 3, 
D and E, and fig. S14C). Furthermore, the W22A mutation in 
xCENP-N (corresponding to mutation of T4 in hCENP-N) se- 
verely disrupted centromeric localization (Fig. 3, F to H, and 
fig. S14D). Thus, xCENP-N localization appears to depend 
more on hydrophobic rather than electrostatic interactions 
as compared to hCENP-N, where the E3A/E7A mutation leads 
to a total loss of centromeric localization in human cells (fig. 
S14, G to I). Our results support a model in which residues 
within helix 1 of hCENP-N (E3, T4, and E7) are critical for the 
recognition of the L1 loop within the CENP-A nucleosome. 
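
The Xenopus-to-human residue correspondences used in the mutational analyses above can be collected into a small lookup table. The sketch below uses only the residue pairs stated in the text; the helper names are hypothetical. It shows that the numbering offset between the two orthologs is not constant along the sequence, so a fixed shift would not map residues correctly (presumably reflecting gaps in the alignment), and that W22/T4 is the one listed position where the amino acid differs.

```python
# (Xenopus residue -> human residue) pairs exactly as stated in the text.
XENOPUS_TO_HUMAN = {
    "E21": "E3",
    "W22": "T4",
    "E25": "E7",
    "R29": "R11",
    "K99": "K81",
    "K108": "K90",
    "K165": "K148",
    "K223": "K207",
}


def position(residue: str) -> int:
    """Extract the sequence position from a label such as 'K165'."""
    return int(residue[1:])


def numbering_offsets(mapping):
    """Return Xenopus position minus human position for each pair."""
    return {x: position(x) - position(h) for x, h in mapping.items()}


def changed_identity(mapping):
    """List Xenopus residues whose amino acid differs from the human one."""
    return [x for x, h in mapping.items() if x[0] != h[0]]


offsets = numbering_offsets(XENOPUS_TO_HUMAN)
for xen, off in offsets.items():
    print(f"{xen} -> {XENOPUS_TO_HUMAN[xen]} (offset {off})")
print("identity differs at:", changed_identity(XENOPUS_TO_HUMAN))
```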

CENP-A and centromeric DNA sequences show evidence 
for rapid evolution (17, 18), and alignments of the CENP-N N- 
terminal region and the CENP-A L1 loop across eukaryotes 
show a high degree of variability (fig. S13, B to D). To test if 
CENP-N may have co-evolved with CENP-A to maintain ki- 
netochore assembly, we created a chimeric CENP-A nucleo- 
some in which R80 and V82 of hCENP-A were respectively 
replaced by Cys and Met to mimic the Xenopus CENP-A L1 
loop sequence (Fig. 3F). Wild-type hCENP-N does not interact 
with the chimeric nucleosome (Fig. 3I), in agreement with 
decreased centromeric localization of a W22T xCENP-N mu- 
tant (fig. S14, E and F). However, chimeric nucleosome bind- 
ing could be enabled by a complementary hCENP-N T4W 
mutant, which also retained its affinity for wild-type hCENP- 
A nucleosomes (Fig. 3I). Together, these results suggest that 
CENP-N has likely co-evolved with CENP-A to enable kineto- 
chore assembly utilizing hydrophobic and electrostatic inter- 
actions to recognize centromeric chromatin across different 
species. 

Centromere identity and function are specified by a two- 
step mechanism wherein the deposition of CENP-A at the 
centromere by the CENP-A chaperone HJURP allows for the 
recognition of CENP-A by the kinetochore proteins CENP-C 
and CENP-N (19). Interestingly, structures of HJURP (20-22), 
CENP-C fragments (5) and CENP-N (this study) in complex 
with different forms of CENP-A show that in all three cases, 
only a small number of residues are involved in the specific 
recognition of CENP-A while additional interactions are used 
for further stabilization (Fig. 4A, and fig. S11, C and D). These 
residues on CENP-A have both conserved and necessary func- 
tions, despite the variability of their amino acid sequence 
across eukaryotes. Our data suggest the CENP-A-interacting 
residues of CENP-N co-evolve with CENP-A, which could re- 
flect a common strategy for maintaining CENP-A specificity 
(17). Previous studies (7, 23) and our results (Fig. 4B and fig. 
S2, A and D) show that CENP-N and fragments of CENP-C 
that include one of its two CENP-A nucleosome binding do- 
mains (fig. S2E; central domain (7, 23); or CENP-C motif (this 
work)) can simultaneously bind to the same face of the CENP- 
A nucleosome. Furthermore, the kinetochore protein CENP- 
L has been shown to mediate interactions between CENP-C 
and CENP-N (9, 23-25). Thus, the valency of CENP-C-CENP- 
N-CENP-L interactions could facilitate clustering of sparse 
and non-adjacent CENP-A nucleosomes (Fig. 4C) (9, 26), 
which might help establish the folding of centromeric chro- 
matin (27, 28), and/or the integrity of the kinetochore (29, 
30). 

Fig. 1. Structure of the human CENP-N/CENP-A nucleosome 
complex. (A) Cryo-EM density map of the hCENP-N₁₋₂₈₆/CENP-A 
nucleosome complex viewed down the axis of the DNA supercoil. (B) 
Schematic of the functional domains of CENP-N known to bind the 
CENP-A nucleosome (gray) and CENP-L (black) (top panel). The CENP- 
N construct used for the present structural analysis (hCENP-N₁₋₂₈₆), and 
the regions of the sequence whose structure we report here (N-terminal 
domain: 1-81, and central domain: 101-185; hCENP-N₁₋₁₈₅) are shown in 
the middle and bottom panels, respectively. (C) Cryo-EM density map of 
the hCENP-N₁₋₂₈₆/CENP-A nucleosome complex as viewed from the side, 
at an orientation 90° to the view shown in (A). This view also depicts the 
extra density connected to the N-terminal domain that we assign to 
MBP, shown with lighter shading. (D) Representative regions of the cryo- 
EM density map to illustrate map quality (from left to right) for canonical 
histones H2A, H2B, and H4, centromere specific H3 variant CENP-A, 
nucleosomal DNA, and CENP-N. 

Fig. 2. Interaction of CENP-N with nucleosomal DNA. (A) Cut-away view of the hCENP-N₁₋₂₈₆/CENP-A 
nucleosome model to highlight interfaces involved in complex formation (please also see fig. S11, A and B). For 
the CENP-N/DNA interface (labeled “1a” and “1b”), nucleosomal DNA is shown as a red ribbon, while positively 
charged residues of CENP-N that are proposed to interact with it are shown as blue spheres. For the CENP- 
N/CENP-A interface (labeled “2”), CENP-A residues (R80-G81-V82) are marked by the short yellow ribbon 
while interacting CENP-N residues (E3, T4 and E7) are shown as yellow spheres. (B) View of the CENP-N/DNA 
interface at different magnifications to highlight details of interactions between the nucleosomal DNA and 
positively charged residues of CENP-N. (C) Gel mobility shift experiment to examine the effects of CENP-N 
mutations (indicated at top of the gel) on binding to the CENP-A nucleosome. Impaired binding is reflected by 
increased intensity of the free nucleosome band, concomitant with the disappearance of defined 1:1 and 2:1 
bands. Labels: N, indicates migration position of the free CENP-A nucleosome; “1” and “2”, indicate migration 
positions of CENP-A nucleosomes bound with either one or two molecules of CENP-N, respectively. (D) Similar 
analysis to that in (C), carried out with a set of CENP-N mutations involving residues distal from the binding 
interface. (E) Images of interphase nuclei in Xenopus egg extracts with exogenous MBP-xCENP-N and xCENP- 
L protein containing the indicated mutations (with analogous human mutations in parentheses), stained with 
an antibody for MBP (green) and Hoechst (blue). (F) Centromeric MBP fluorescence intensity normalized as a 
percentage of that observed for wildtype MBP-xCENP-N. Error bars represent SEM (n > 200 centromeres). 

Fig. 3. Interaction between the L1 loop of CENP-A and helix 1 of CENP- 
N. (A and B) Overall (A) and close-up (B) view of the hCENP-N₁₋₂₈₆/CENP-A 
interface formed between R80-G81-V82 on the L1 loop of CENP-A, and E3, 
T4, and E7 on helix 1 of CENP-N. (C) Gel mobility shift experiment to 
examine effects of CENP-N mutations (indicated at top of the gel) on 
binding to the CENP-A nucleosome. (D) Images of interphase nuclei in 
Xenopus egg extracts with exogenous MBP-xCENP-N and xCENP-L protein 
containing the indicated mutations of xCENP-N residues E21 and E25 
(corresponding to residues E3 and E7 in hCENP-N), stained with an 
antibody for MBP (green) and Hoechst (blue). (E) Centromeric MBP 
fluorescence intensity normalized as a percentage of that observed for 
wildtype MBP-xCENP-N. Error bars represent SEM (n > 200 centromeres). 
(F) Alignment of human and Xenopus laevis sequences corresponding to 
the L1 loop of CENP-A and helix 1 of CENP-N. Closely interacting segments 
of the L1 loop of CENP-A and helix 1 of CENP-N are highlighted by shading. 
The asterisks indicate conserved Glu residues (black asterisks) and 
variability in the hydrophobic residue corresponding to position T4 (red 
asterisk) of human CENP-N. (G) Images of interphase nuclei in Xenopus egg 
extracts with exogenous MBP-xCENP-N and xCENP-L protein containing 
the indicated mutations of xCENP-N, as in (D). (H) Centromeric MBP 
fluorescence intensity, determined as in (E). (I) Gel mobility shift 
experiment to examine the effects of correlated amino acid substitutions 
between the L1 loop of CENP-A and helix 1 of CENP-N. 
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Fig. 4. Structural determinants of kinetochore assembly on the CENP-A nucleosome. (A) Sequence 
alignment between human H3.1 and CENP-A to highlight unique CENP-A motifs involved in deposition and 
recognition of CENP-A at centromeric chromatin (also see fig. S11, C and D). (B) Two different views of the 
CENP-A nucleosome bound to hCENP-N and a modeled CENP-C motif peptide (5) to highlight potential dual 
binding of full-length CENP-C and CENP-N proteins on the CENP-A nucleosome. The second CENP-N (shown 
with lighter shading) is modeled based on the cryo-EM density map obtained in the presence of excess
hCENP-N1-286 (fig. S4), while the CENP-C motif peptides (human numbering shown for clarity) on each face of the
nucleosome are positioned based on the crystal structure of the nucleosome in complex with the rat CENP-C 
motif (5). (C) Schematic view to highlight recognition and possible enrichment of CENP-A nucleosomes by the 
CCAN proteins, CENP-C, CENP-N, and CENP-L. Other kinetochore proteins, and the dimerization of CENP-C, 
have been omitted for clarity. 
Multiplexed gene synthesis in
emulsions for exploring protein 
functional landscapes 


Calin Plesa, Angus M. Sidore, Nathan B. Lubock, Di Zhang, Sriram Kosuri


Improving our ability to construct and functionally characterize DNA sequences would broadly 
accelerate progress in biology. Here, we introduce DropSynth, a scalable, low-cost method to 
build thousands of defined gene-length constructs in a pooled (multiplexed) manner. DropSynth 
uses a library of barcoded beads that pull down the oligonucleotides necessary for a gene’s 
assembly, which are then processed and assembled in water-in-oil emulsions. We used 
DropSynth to successfully build more than 7000 synthetic genes that encode phylogenetically 
diverse homologs of two essential genes in Escherichia coli. We tested the ability of 
phosphopantetheine adenylyltransferase homologs to complement a knockout E. coli strain in 
multiplex, revealing core functional motifs and reasons underlying homolog incompatibility. 
DropSynth coupled with multiplexed functional assays allows us to rationally explore sequence- 
function relationships at an unprecedented scale. 


The scale at which we can build and functionally
characterize DNA sequences sets the pace at which
we explore and engineer biology. The recent
development of multiplexed functional assays allows for the facile
testing of thousands to millions of sequences
for a wide array of biological functions (1, 2).
Currently, such assays are limited by their ability 
to build or access DNA sequences to test. Natural 
or mutagenized DNA sequences (3, 4) allow for 
large libraries but are not easily programmed 
and thus limit hypotheses, applications, and 
engineered designs. Alternatively, researchers 
can use low-cost microarray-based oligo pools 
that allow for large libraries of designed ~200- 
nucleotide (nt) sequences (5), but their short 
lengths limit many other applications. Gene syn- 
thesis is capable of creating long-length sequences, 
but high costs currently prohibit building large 
libraries of designed sequences (6-9). 

We developed a gene synthesis method we 
term DropSynth: a multiplexed approach capable 
of building large pooled libraries of designed gene- 
length sequences. DropSynth uses microarray- 
derived oligo libraries to assemble gene libraries 
at vastly reduced costs. We and others have 
developed robust parallel processes to build genes 
from oligo arrays, but because each gene must be 
assembled individually, costs are prohibitive 
for large gene libraries (6, 10). In these efforts, 
the ability to isolate and concentrate DNA from
the background pool complexity was paramount
for robust assemblies (11). Previous efforts to
multiplex such assemblies have not isolated re- 
actions from one another and thus suffered from 
short assembly lengths, highly biased libraries, 
the inability to scale, and constraints on sequence 
homology (12-15). 

DropSynth works by pulling down only those 
oligos required for a particular gene’s assembly 
onto barcoded microbeads from a complex oligo 
pool. By emulsifying this mixture into picoliter 
droplets, we isolate and concentrate the oligos 
before gene assembly, overcoming the critical 
roadblocks for proper assembly and scalability 
(Fig. 1A and movie S1). The microbead barcodes 
are distinct 12-nt sequences that all oligos for 
a particular assembly share, and pair with com- 
plementary strands displayed on the microbead. 
Within each droplet, sequences are released from 
the bead by using Type IIs restriction enzyme 
sites and assembled through polymerase cycling 
assembly (PCA) into full-length genes. Last, the 
emulsion is broken, and the gene library is re- 
covered. To test and optimize the protocol, we 
built model assemblies that were different but 
shared common overlap sequences. As a result, 
any contaminating oligo would still participate 
in the assembly reaction, allowing us to monitor 
assembly specificity and library coverage. We 
optimized each aspect of the protocol by trying 
to assemble 24-, 96-, and 288-member libraries 
composed of 3, 4, 5, and 6 oligos at once, based 
on how often we saw intended targets versus 
their expected frequency given random (bulk) 
assembly (Fig. 1B). Over many iterations, we 
achieved high enrichment rates (~10*) by modify- 
ing the amount of beads, presence of size selec- 
tion after assembly, ligase used for capture, and 
bead attachment chemistry. We ultimately found 
that using streptavidin bead chemistry, Taq ligase 
for bead capture, and size-selection after assembly 
yielded the highest enrichment rates. Using these
protocols, we were able to build libraries of up 
to six oligos that produced correctly sized bands 
(Fig. 1C), and the resulting assembly distributions 
were not overly skewed (Fig. 1D and fig. S1). 
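
The enrichment metric used during optimization can be illustrated with a short sketch. It assumes, as a simplification not spelled out in the text, that in a bulk reaction each of the k oligo positions can be filled by the oligo from any of the N designs, so that N of the N^k possible products are intended targets. The function names and read counts are hypothetical.

```python
def expected_specific_fraction(n_genes: int, oligos_per_gene: int) -> float:
    """Fraction of products that are intended targets under random assembly.

    Assumption: every oligo position can be filled by any of the n_genes
    designs, giving n_genes ** oligos_per_gene combinations, of which
    exactly n_genes are intended.
    """
    return n_genes / n_genes ** oligos_per_gene


def enrichment(specific_reads: int, total_reads: int,
               n_genes: int, oligos_per_gene: int) -> float:
    """Observed fraction of intended assemblies relative to random chance."""
    observed = specific_reads / total_reads
    return observed / expected_specific_fraction(n_genes, oligos_per_gene)


if __name__ == "__main__":
    # Hypothetical read counts for a 96-member, four-oligo model library.
    value = enrichment(specific_reads=500, total_reads=1000,
                       n_genes=96, oligos_per_gene=4)
    print(f"enrichment: {value:.3g}")
```

Under this model the expected background falls steeply with library size and oligo number, which is why even modest observed specificity corresponds to a large enrichment.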

To test the scalability of DropSynth, we at- 
tempted assembly of 12,672 genes ranging in 
size from 381 to 669 base pairs (bp) that encode 
homologs of two bacterial proteins from across 
the tree of life (Fig. 2A and fig. S2). A total of
33 libraries of 384 genes each encoded 5775 
homologs of dihydrofolate reductase (DHFR) 
with two different codon usages (11,520 DHFR 
genes), as well as 1152 homologs of the enzyme 
phosphopantetheine adenylyltransferase (PPAT) 
(fig. S3, A and B). DHFR genes were assembled 
from either four or five 230-nt oligos, whereas 
PPAT genes were assembled from five 200-nt 
oligos. We obtained correctly sized bands for 31 
of 33 assemblies, with one failing because of 
oligo amplification issues and the other because 
of low yield on the oligo processing steps, in 
contrast to attempts using bulk assembly that 
produced shorter failed by-products (fig. S3C). 
Three of the libraries (5x 230-nt oligomers) were 
too long to verify by using our barcoding approach, 
but the resulting synthesis showed correct band 
formation (fig. S4).

We cloned the libraries into an expression 
plasmid containing a random 20-bp barcode 
(assembly barcode) and sequenced the remain- 
ing 28 libraries consisting of 10,752 designs (figs. 
S3D, S4, and S5). For the PPAT 5x 200-nt oligo 
assemblies, sequencing revealed that a total of 
872 genes (75%) had assemblies corresponding 
to a perfect amino acid sequence represented by 
at least one assembly barcode, with a median of 
two reads per assembly barcode and 56 as- 
sembly barcodes per homolog (Fig. 2B and fig. 
S6, A and B). This coverage increased when 
including sequences with deviations from the 
designed sequences, with 1002 genes (87%) rep- 
resented within five amino acids from the 
designed sequences (all homologs have some 
alignments regardless of distance) (fig. S6D). 
For the DHFR 4x 230-nt oligo assemblies, we 
observed perfect sequences for 65% (6271) of the 
designed homologs, and 75% have at least one 
assembly within a two-amino-acid difference 
from design. Because there are two codon usages 
per homolog, when combined over homologs 
we observed that 3950 (79%) have at least one 
perfect, and 88% have at least one assembly in 
a distance of two amino acids (Fig. 2C). We see 
a strong correlation [Pearson correlation co- 
efficient (p) = 0.73, P value = 3.4 x 10°°] be- 
tween the amount of DNA used to load the 
DropSynth beads and the resulting library cov- 
erage (fig. S7A). We also found 15 microbead 
barcodes that have more dropouts than would 
be expected by chance (fig. S7B). For constructs 
with at least 100 assembly barcodes, we observed 
a median of 1.9% (σ = 2.9%) and 3.9% (σ =
3.8%) perfect protein assemblies (Fig. 2A and
figs. S6C and S8) for PPAT and DHFR libraries,
respectively. The nearly doubled rate of perfect assemblies
for DHFR libraries compared with PPAT can be
attributed to using longer oligos (230 versus
200 nt) that only require four oligos instead of 
five to assemble the gene (fig. S9A). Increasing the 
oligo length provides a way to assemble longer 
genes without large decreases in the resulting 
yields (fig. S9B). Furthermore, the distribution of 
perfect assemblies in the PPAT libraries is not 
overly skewed (fig. S6D), and most library mem- 
bers have assemblies with high identity to their 


Fig. 1. DropSynth assembly and optimization. 
(A) We amplified array-derived oligos and 
exposed a single-stranded region that acts as a 
gene-specific microbead barcode. Barcoded 
beads display complementary single-stranded 
regions that selectively pull down the oligos 
necessary to assemble each gene. The beads are 
then emulsified, and the oligos are assembled by 
means of PCA. The emulsion is then broken, and 
the resultant assembled genes are barcoded and 
cloned. (B) We used a model gene library that 
allowed us to monitor the level of specificity and 
coverage of the assembly process. We then 
optimized various aspects of the protocol— 
including purification steps, DNA ligase, and 
bead couplings—in order to improve the speci- 
ficity of the assembly reaction. Enrichment is 
defined as the number of specific assemblies 
observed relative to what would be observed by 
random chance in a full combinatorial assembly. 
(C) We attempted 96-plex gene assemblies with 
three, four, five, or six oligos, and the resultant 
libraries displayed the correct-sized band on an 
agarose gel. (D) The distribution of read counts 
for all 96 assemblies (four-oligo assembly) as 
determined with NGS. 


Fig. 2. DropSynth assembly of 10,752 genes. 
(A) We used DropSynth to assemble 28 libraries 
of 10,752 genes representing 1152 homologs of 
PPAT and 4992 homologs of DHFR. The number 
of library members with at least one perfect 
assembly and the median percent perfects 
determined by using constructs with at least 100 
barcodes is shown for each library. (B) We 
observed that 872 PPAT homologs (75%) had at 
least one perfect assembly, and 1002 homologs 
(87%) had at least one assembly within a 
distance of five amino acids from design. (C) We 
assembled two codon variants for each designed 
DHFR homolog, allowing us to achieve higher 
coverage. 
respective designed homologs (fig. S6F). The re-
sultant error profiles were consistent with Taq- 
derived mismatch and assembly errors that we 
have observed previously (fig. S10) (16). 
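
The two coverage statistics reported above (the fraction of designs with at least one perfect assembly, and the median percentage of perfect assemblies among well-sampled constructs) can be computed from a table of assembly barcodes. The sketch below is illustrative only: the record layout, the function name, and the toy data are assumptions, and only the cutoff of 100 barcodes comes from the text.

```python
from collections import defaultdict
from statistics import median


def library_coverage(records, designs, min_barcodes=100):
    """Summarize coverage from per-barcode records.

    records: iterable of (design_id, is_perfect), one entry per assembly barcode.
    designs: list of all designed construct IDs in the library.
    Returns (fraction of designs with >= 1 perfect assembly,
             median percent perfect among designs with >= min_barcodes barcodes,
             or None if no design reaches that depth).
    """
    total = defaultdict(int)
    perfect = defaultdict(int)
    for design_id, is_perfect in records:
        total[design_id] += 1
        if is_perfect:
            perfect[design_id] += 1

    covered = sum(1 for d in designs if perfect[d] > 0)
    percent_perfect = [100.0 * perfect[d] / total[d]
                       for d in designs if total[d] >= min_barcodes]
    med = median(percent_perfect) if percent_perfect else None
    return covered / len(designs), med


if __name__ == "__main__":
    # Toy data: three designs, one never observed.
    toy = [("a", True), ("a", True), ("a", False), ("a", False),
           ("b", False), ("b", False)]
    print(library_coverage(toy, ["a", "b", "c"], min_barcodes=2))
```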

We sought to show how DropSynth-assembled 
libraries could be easily coupled as inputs into 
multiplex functional assays by probing how 
well the PPAT homologs of various evolutionary 
distance to Escherichia coli could rescue a knockout 
phenotype. PPAT is an essential enzyme, encoded
by the gene coaD, which catalyzes the second-to-last
step in the biosynthesis of coenzyme A (CoA)
(fig. S11) (17) and is an attractive target for the
development of novel antibiotics (18). Assembled
PPAT variants on the barcoded expression plasmid
were transformed into E. coli ΔcoaD cells and
screened for complementation by growing the library
in batch culture over four serial 1000-fold
dilutions (Fig. 3A and table S1), while a rescue
plasmid was simultaneously heat-cured (fig. S12). 
Assembly barcode sequencing of the resulting 
populations provided a reproducible estimate 
for the fitness of all homologs successfully as- 
sembled without error (biological replicates 
p = 0.94; Pearson, P < 2.2 × 10⁻¹⁶) (figs. S13A
and S14A). Individual barcodes can display con- 
siderable noise, so having many assembly bar- 
codes per construct improved confidence (fig. S14, 
B and C). Negative controls and sequences con- 
taining indels show strong depletion (figs. S13A, 
S15A, and S16), and fitness is reduced with in- 
creasing numbers of mutations (p = -0.38; 
Spearman, P < 2.2 × 10⁻¹⁶) (fig. S15, B and C).
Pooled fitness scores also correlated well with 
measured growth rates of individually tested con- 
trols [Spearman correlation coefficient (7,) = 0.86, 
P = 5.9 x 10°”] (fig. S17). Approximately 14%
of the homologs show strong depletion
(fitness below -2.5), whereas 70% have a positive 
fitness value in the pooled assay. Low-fitness 
homologs are evenly distributed throughout the 
phylogenetic tree, with only minor clustering of 
clades (Fig. 3B and figs. S13B, S18, and S19A). 
There are several reasons homologs could have 
low fitness, including environmental mismatches, 
improper folding, mismatched metabolic flux, inter- 
actions with other cytosolic components, or gene 
dosage toxicity effects resulting from improperly 
high expression (supplementary text) (19). 
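
A pooled fitness score of this kind can be sketched as a log ratio of barcode frequencies before and after selection, summarized per homolog across its assembly barcodes. The exact scoring used in the study is not given in this section, so the log2 ratio, the pseudocount, and the use of the median are assumptions; only the thresholds (-2.5 for strong depletion, 0 for positive fitness) are taken from the text.

```python
import math
from statistics import median


def barcode_fitness(pre_count, post_count, pre_total, post_total, pseudocount=0.5):
    """Log2 change in a barcode's frequency across selection (assumed scoring)."""
    pre = (pre_count + pseudocount) / pre_total
    post = (post_count + pseudocount) / post_total
    return math.log2(post / pre)


def homolog_fitness(barcode_counts, pre_total, post_total, pseudocount=0.5):
    """Median barcode fitness per homolog.

    barcode_counts: dict mapping homolog ID -> list of (pre_count, post_count),
    one pair per assembly barcode. Taking the median over many barcodes damps
    the noise of any single barcode.
    """
    return {
        homolog: median(
            barcode_fitness(pre, post, pre_total, post_total, pseudocount)
            for pre, post in pairs
        )
        for homolog, pairs in barcode_counts.items()
    }


def classify(fitness):
    """Label a score using the thresholds quoted in the text."""
    if fitness < -2.5:
        return "strongly depleted"
    if fitness > 0:
        return "positive"
    return "intermediate"


if __name__ == "__main__":
    # Hypothetical counts for one homolog with three assembly barcodes.
    scores = homolog_fitness({"h1": [(100, 400), (100, 100), (100, 25)]},
                             pre_total=1000, post_total=1000)
    print(scores, classify(scores["h1"]))
```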
Errors during the oligo synthesis or DropSynth 
assembly give us mutational data across all the 


Fig. 3. PPAT complementation assay. 
(A) We used DropSynth to assemble a 
library of 1152 homologs of PPAT, an 
essential enzyme catalyzing the 
second-to-last step in CoA biosynthesis, 
and functionally characterized them 
using a pooled complementation assay. 
The barcoded library was transformed into 
E. coli ΔcoaD cells containing a curable
rescue plasmid expressing E. coli coaD. The 
rescue plasmid was removed, allowing 
the homologs and their mutants to 
compete with each other in batch 
culture. We tracked assembly barcode 
frequencies over four serial 1000-fold 
dilutions and used the frequency 
changes to assign a fitness score. 

(B) This phylogenetic tree shows
451 homologs each with at least five
assembly barcodes, a subset of
the full data set, in which leaves are
colored by fitness. Despite having
a median 50% sequence identity,
we found that the majority of PPAT
homologs are able to complement the
function of the native E. coli PPAT,
with 70% having positive fitness
values, whereas low-fitness homologs
are dispersed throughout the tree,
without much clustering of clades.
homologs, which we can further analyze to better
understand function. We selected all 497 homo- 
logs that showed some degree of complementa- 
tion (fitness greater than -1) as well as their 
71,061 mapped mutants within a distance of 
five amino acids and carried out a multiple se- 
quence alignment in order to find equivalent 
residue positions. For each amino acid and posi- 
tion, we found the median fitness among all of 
these homologs and mutants. The resulting data 
was projected onto the E. coli PPAT sequence 
(Fig. 4, A and B), providing data similar to deep 
mutational scanning approaches (20, 21). We 
term this approach broad mutational scanning 
(BMS). The average BMS fitness for each posi- 
tion shows strong constraints in the catalytic site, 
at highly conserved sites (p = -0.64; Pearson, 
P < 2.2 × 10⁻¹⁶), and at buried residues compared
with solvent-accessible ones (p = 0.42; Pearson, 
P = 39 x 10°) (fig. S20, A and B, and supplemen- 
tary text). Surprisingly, some residues that are 
known to interact with either adenosine 5’- 
triphosphate (ATP) or 4’-phosphopantetheine 
turn out to be relatively promiscuous when aver- 
aged over a large number of homologs. Further- 
more, when mapped onto the E. coli structure 
(Fig. 4B), positions known to be involved with 
allosteric regulation by CoA or dimer formation 
show relatively little constraint, highlighting the 
diversity of distinct approaches used among dif- 
ferent homologs while maintaining the same 
core function. We implemented a simple binary 
classifier to predict the sign of the BMS fitness 
value on the basis of a number of features,
achieving an accuracy of 0.825 (fig. S21).
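
The aggregation step behind broad mutational scanning (median fitness for each amino acid at each aligned position, projected onto a reference sequence) can be sketched as follows. This is a minimal illustration that assumes the multiple sequence alignment has already been computed; the data layout, function name, and toy alignment are hypothetical.

```python
from collections import defaultdict
from statistics import median


def broad_mutational_scan(aligned, fitness, reference):
    """Median fitness per (reference position, amino acid).

    aligned:   dict name -> aligned sequence (equal lengths, '-' for gaps).
    fitness:   dict name -> fitness score; names without a score are skipped.
    reference: name of the sequence to project onto (e.g. the E. coli protein).
    Positions are 1-based and counted along the ungapped reference.
    """
    ref_seq = aligned[reference]
    values = defaultdict(list)
    ref_pos = 0
    for col, ref_aa in enumerate(ref_seq):
        if ref_aa == "-":
            continue  # column absent from the reference: not projected
        ref_pos += 1
        for name, seq in aligned.items():
            aa = seq[col]
            if name in fitness and aa != "-":
                values[(ref_pos, aa)].append(fitness[name])
    return {key: median(vals) for key, vals in values.items()}


if __name__ == "__main__":
    # Toy alignment with a hypothetical reference and three homologs.
    aligned = {"ref": "MK-L", "h1": "MKAL", "h2": "MR-L", "h3": "-RAI"}
    fitness = {"ref": 0.0, "h1": 1.0, "h2": -1.0, "h3": -3.0}
    for key, value in sorted(broad_mutational_scan(aligned, fitness, "ref").items()):
        print(key, value)
```

Averaging over many divergent backgrounds is what distinguishes this projection from a deep mutational scan of a single sequence: a residue's score reflects its tolerance across homologs rather than in one context.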

Additionally, we can search for gain-of-function 
(GOF) mutations among those homologs that 
did not complement. A total of 385 GOF mutants 
out of 4658 were found for 55 homologs out of 
129 low-fitness homologs (fitness < -2.5). By align- 
ing these mutations to the E. coli sequence, the 
eight statistically significant residues (34, 35, 64, 
68, 69, 103, 134, and 135) shown in Fig. 4C localize 
to four small regions in the protein structure (fig. 
S22 and supplementary text). We retrieved six
GOF mutants of six different homologs from the 
library, each with fitness determined from only a 
single assembly barcode, and individually tested 
their growth rates. Five of the six mutants showed 
strong growth, and one failed to complement (fig. 
S17B). We also tested two of the corresponding 
low-fitness homologs, finding increases in the 
growth rate of 10 and 42% for their GOF mutants 
(table S2).
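
The search for gain-of-function mutants can be sketched as a filter over mutants of low-fitness parents, followed by a tally of mutated positions in E. coli coordinates. The low-fitness threshold (-2.5) is from the text; the cutoff used here to call a mutant a gain of function (fitness above -1, borrowed from the complementation threshold) is an assumption, as are the record layout and names. The statistical test for significant positions is not reproduced.

```python
from collections import Counter

LOW_FITNESS = -2.5   # low-fitness homolog threshold, from the text
GOF_CUTOFF = -1.0    # assumed: reuse of the complementation threshold


def find_gof(parent_fitness, mutants):
    """Select gain-of-function mutants and count mutated positions.

    parent_fitness: dict homolog ID -> fitness of the error-free homolog.
    mutants: iterable of (parent ID, tuple of mutated E. coli-aligned
             positions, mutant fitness).
    Returns (list of GOF mutant records, Counter of positions).
    """
    gof = [
        record for record in mutants
        if parent_fitness.get(record[0], 0.0) < LOW_FITNESS
        and record[2] > GOF_CUTOFF
    ]
    per_position = Counter(pos for _, positions, _ in gof for pos in positions)
    return gof, per_position


if __name__ == "__main__":
    # Hypothetical parents and mutants.
    parents = {"hA": -3.0, "hB": 0.5}
    mutants = [("hA", (34,), 0.2), ("hA", (34, 68), -0.5),
               ("hA", (10,), -3.1), ("hB", (34,), 1.0)]
    hits, counts = find_gof(parents, mutants)
    print(len(hits), counts.most_common())
```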

Broad mutational scanning enabled by DropSynth 
is a useful tool with which to explore protein 
functional landscapes. By analyzing many highly 
divergent homologs, individual steric clashes, 
which might be important to a particular se- 
quence, become averaged across the homologs. 
More broadly, DropSynth allows for building 
large designed libraries of gene-length sequences, 
with no specialized equipment and estimated 
total costs below $2 per gene (tables S3 and S4). 
We also show that DropSynth can be combined 
with dial-out polymerase chain reaction (15), which 
Fig. 4. Broad mutational scanning analysis. (A) The fitness landscape of
497 complementing PPAT homologs and their 71,061 mutants (within a 
distance of five amino acids) is projected onto the E. coli PPAT sequence, with 
each point in the heatmap showing the average fitness over all sequences 
containing that amino acid at each aligned position. Mutations are highly 
constrained at a core group of residues involved in catalytic function. Other 
positions show relatively little loss of function, when averaged over many 
homologs, despite known interactions with the substrates. The E. coli wild-type 
(WT) sequence is indicated by green squares, and the average position fitness, 
fitness of a residue deletion, mean EVmutation evolutionary statistical energy 
(22), site conservation, relative solvent accessibility, and secondary structure 
information is shown above. (B) The average fitness at each position, with blue
and red representing low and high fitness, respectively, overlaid on the E. coli 
PPAT [Protein Data Bank 1QJC and 1GN8 (23)] structure complexed with 
4'-phosphopantetheine and ATP. We observed loss of function for mutations 
occurring at the active site, whereas other residues involved with allosteric 
regulation by CoA or dimer interfaces show large promiscuity, highlighting 
different strategies used among homologs. (C) In addition to complementing 
homologs, we can also analyze mutants of the 129 low-fitness (<-2.5) 
homologs, finding 385 GOF mutants across 55 homologs. We project this data 
onto the E. coli PPAT sequence and plot the number of GOF mutants at each 
position, shaded by the number of different homologs represented. We found a 
total of eight statistically significant positions (residues 34, 35, 64, 68, 69, 
103, 134, and 135) corresponding to four regions in the PPAT structure. 


could be expanded for gene synthesis applications
for which perfect sequences are paramount.
The quality [...] libraries [...] likely be improved further with investment
in algorithm design, better polymerases, and
larger barcoded bead libraries.


REFERENCES AND NOTES

1. F. Inoue, N. Ahituv, Genomics 106, 159-164 (2015).
2. M. Gasperini, L. Starita, J. Shendure, Nat. Protoc. 11, 1782-1787 (2016).
3. K. S. Sarkisyan et al., Nature 533, 397-401 (2016).
4. D. M. Fowler, S. Fields, Nat. Methods 11, 801-807 (2014).
5. G. J. Rocklin et al., Science 357, 168-175 (2017).
6. S. Kosuri, G. M. Church, Nat. Methods 11, 499-507 (2014).
7. S. Ma, N. Tang, J. Tian, Curr. Opin. Chem. Biol. 16, 260-267 (2012).
8. J. Quan et al., Nat. Biotechnol. 29, 449-452 (2011).
9. R. A. Hughes, A. D. Ellington, Cold Spring Harb. Perspect. Biol. 9, a023812 (2017).
10. S. Kosuri et al., Nat. Biotechnol. 28, 1295-1299 (2010).
11. A. Y. Borovkov et al., Nucleic Acids Res. 38, e180 (2010).
12. J. C. Klein et al., Nucleic Acids Res. 44, e43 (2016).
13. H. Kim et al., Nucleic Acids Res. 40, e140 (2012).
14. T. H.-C. Hsiau et al., PLOS ONE 10, e0119927 (2015).
15. J. J. Schwartz, C. Lee, J. Shendure, Nat. Methods 9, 913-915 (2012).
16. N. B. Lubock, D. Zhang, A. M. Sidore, G. M. Church, S. Kosuri, Nucleic Acids Res. 45, 9206-9217 (2017).
17. T. Izard, A. Geerlof, EMBO J. 18, 2021-2030 (1999).
18. B. L. M. de Jonge et al., Antimicrob. Agents Chemother. 57, 6005-6015 (2013).
19. S. Bhattacharyya et al., Eng. Life Sci. 5, e20309 (2016).

Plesa et al., Science 359, 343-347 (2018) 19 January 2018 




20. D. S. Marks, T. A. Hopf, C. Sander, Nat. Biotechnol. 30, 1072-1080 (2012).
21. N. Halabi, O. Rivoire, S. Leibler, R. Ranganathan, Cell 138, 774-786 (2009).
22. T. A. Hopf et al., Nat. Biotechnol. 35, 128-135 (2017).
23. T. Izard, J. Mol. Biol. 315, 487-495 (2002).


ACKNOWLEDGMENTS 


This work was supported by funds from the Human Frontier
Science Program (LT000068/2016 to C.P.), Netherlands Organisation
for Scientific Research Rubicon fellowship (to C.P.), National Science
Foundation Graduate Research Fellowship under grant 2016211460 (to
A.M.S.), a Ruth L. Kirschstein National Research Service Award
(GM007185 to N.L.), National Institutes of Health New Innovator Award
(DP2GM114829 to S.K.), Searle Scholars Program (to S.K.), U.S.
Department of Energy (DE-FC02-02ER63421 to S.K.), UCLA, and
L. Wudl and F. Wudl. We thank J. Sampson and P. Anderson at Agilent
Technologies for oligo pools and critical advice. We thank G. Church
and R. Terry for guidance during the early developments and S. Feng,
the UCLA Broad Stem Cell Research Center Sequencing Core, and the
Technology Center for Genomics and Bioinformatics for providing
next-generation sequencing (NGS) services. S.K. and D.Z. are named
inventors on a patent application on the DropSynth method
(US14460496). The scripts required to generate DropSynth oligos are
available at https://github.com/kosurilab/DropSynth. Sequencing
data are available from the Sequence Read Archive (SRA) with the
accession no. SRP126669.


SUPPLEMENTARY MATERIALS 


www.sciencemag.org/content/359/6373/343/suppl/DC1
Materials and Methods
Figs. S1 to S25
Tables S1 to S14
References (24-48)
Movie S1


28 July 2017; accepted 18 December 2017 
10.1126/science.aao5167 








Life Science Technologies 
new products 


Confocal Laser 
Scanning Microscopes 
The Olympus FLUOVIEW 
FV3000 and FV3000RS 
confocal laser scanning 
microscopes combine high- 
performance imaging ca- 
pabilities with ease of use, 
so researchers can collect 
publication-quality imaging 
data quickly and efficiently. Available as a hybrid system, the FV3000RS is 
equipped with both a galvanometer scanner and an extremely accurate 
resonant scanner that can capture dynamic physiological events at up 
to 438 frames per second. TruSpectral technology is featured on every 
detector, providing the flexibility of spectral imaging while maintaining 
efficient light transmission for excellent sensitivity. Built for fast, stable, 
and accurate measurements of biological reactions within living cells or 
tissues, these microscopes allow even novice users to generate superior 
data and images. 

Olympus 

For info: 704-877-8801 
www.olympus-lifescience.com/en/laser-scanning/fv3000 


Cell Analyzer 

With the Muse Cell Analyzer, you can now achieve highly quantitative 
results at a fraction of the price, effort, and time. It is a compact [foot- 
print of only 8 in. x 10 in. (20 cm x 25 cm)], easy-to-use benchtop device, 
making flow cytometry accessible to anyone, anytime. A user-friendly, 
integrated touchscreen interface, intuitive software for data acquisition 
and analysis, and optimized Muse assays help to simplify your research. 
The Muse's microcapillary flow cell is engineered for acquisition of both 
suspension and adherent cells of 2 μm-60 μm in diameter. It uses fluorescent
reagents and detection to measure three parameters for every
cell, with little or no sample preparation required. Muse assays are avail- 
able for precision cell counts as well as single-cell measurement of criti- 
cal cell parameters, including viability, apoptosis, autophagy, oxidative 
stress, and cell signaling. 

EMD Millipore 

For info: 800-645-5476 

www.emdmillipore.com 


Focused Ion Beam Scanning Electron Microscope

The ZEISS Crossbeam 550 is a focused ion beam scanning electron micro- 
scope (FIB-SEM) that features an increase in resolution for imaging and 
material characterization and a speed gain in sample preparation. Nano- 
structures such as composites, metals, biomaterials, or semiconductors 
can be investigated with analytical and imaging methods. The Crossbeam 
550 allows simultaneous modification and monitoring of samples, re- 
sulting in fast sample preparation and high throughput (e.g., for cross- 
sectioning, transmission electron microscope lamella preparation, or 
nanopatterning). It provides quality 2D and 3D images. The Tandem decel
mode enables enhanced resolution along with maximization of image
contrast at low landing energies. Gemini II electron optics deliver opti- 
mum resolution at low voltage and high probe current. The FIB column 


Produced by the Science/AAAS Custom Publishing Office 


combines the highest available FIB current of 100 nA with the FastMill 
mode, allowing for precise, efficient material processing and imaging. 
Automated emission recovery increases convenience and optimizes the 
FIB column for reproducible results during long-term experiments. 
ZEISS 

For info: 858-790-7700 

www.zeiss.com/crossbeam 


Proteomics Reagents 

Get higher throughput, greater robustness, and more convenience 

with an Agilent Jet Stream proteomics solution. Based on Agilent’s revo- 
lutionary Jet Stream source, Agilent Jet Stream proteomics solutions 
achieve near-nanoflow analytical sensitivity using 2.1-mm columns and 
standard-flow chromatography. And when sample size is limited and the 
highest sensitivity is required, the new Nanodaptor solution converts 
conventional high-performance liquid chromatography to a nanoflow 
system. Agilent's comprehensive proteomics portfolio of hardware, software,
automated sample preparation, and columns is designed to meet
your application needs. 

Agilent Technologies 

For info: 877-424-4536 

www.agilent.com 


AAV Biosensors 

AMS Biotechnology’s adeno-associated virus (AAV) biosensor products 
come as ready-to-use AAV viruses. The viruses encode your chosen bio- 
sensor, either calcium or glutamate, and are ready for in vivo injection. 
Biosensors are genetically engineered fluorescent proteins attached to 
an additional protein sequence that makes them sensitive to small biomolecules
(e.g., Ca²⁺) or other intracellular processes. These biosensors
are introduced to cells, tissues, or organisms to detect changes through 
fluorescence microscopy. Many biosensors permit long-term imaging 
and can be engineered to specifically target cellular compartments or 
organelles. Additionally, biosensors permit signaling pathway explora- 
tion or allow the measurement of a biomolecule—they do all this while 
preserving both spatial and temporal cellular processes. 

AMS Biotechnology 

For info: +44-(0)-1235-828200 
www.amsbio.com/cellular-metabolism.aspx 


Automated Microscope 

The Lionheart FX Automated Microscope is a compact, inclusive mi- 
croscopy system for a broad range of imaging workflows. It offers up to 
100X air- and oil-immersion magnification, with fluorescence, brightfield, 
color-brightfield, and phase-contrast channels for maximum application 
reach. An optional environmental-control cover provides incubation to 
40°C and effective containment for CO₂/O₂ control; a humidity chamber
optimizes conditions for long-term, live-cell imaging applications; and 
an available dual reagent injector facilitates rapid kinetic assays. Auto- 
mated image preprocessing optimizes images for downstream analysis, 
from cell counting to characterization of subcellular details. The filter/ 
LED cubes and objectives are accessed from the front panel, and tool- 
free connections for injectors, gas control, and the environmental cover 
add to the easy installation, which takes only 15 minutes. Its small size 
means that minimal benchtop space is used. 

BioTek 

For info: 888-451-5171 

www.biotek.com 


Electronically submit your new product description or product literature information! Go to www.sciencemag.org/about/new-products-section for more information. 


Newly offered instrumentation, apparatus, and laboratory materials of interest to researchers in all disciplines in academic, industrial, and governmental organizations are featured in this 
space. Emphasis is given to purpose, chief characteristics, and availability of products and materials. Endorsement by Science or AAAS of any products or materials mentioned is not 
implied. Additional information may be obtained from the manufacturer or supplier. 


348 19 JANUARY 2018 • VOL 359 ISSUE 6373 


sciencemag.org/custom-publishing SCIENCE 


Submit your high-impact research 
to Science Immunology 


Science Immunology publishes original, peer-reviewed, 
science-based research articles that report critical 
advances in all areas of immunological research, including 
important new tools and techniques. Share your research 
with Science Immunology's global readership and submit 
your manuscript today! 


What will your discovery be? 


Submit your manuscript today at 
ScienceImmunology.org 




NEW ENGLAND BIOLABS 


Monarch’ Nucl 
Now available 


Designed with — in 
Purification Kits are the perfect 
biology workflows. Avail for DN: 

with buffers and columns availabl *, 
are optimized for excellent performance, c 
Quickly and easily recover highly pure, inta 
in minutes. Available kits include: 


* Monarch Plasmid Miniprep Kit 
* Monarch DNA Gel Extraction Kit 
* Monarch PCR & DNA Cleanup Kit (5 μg) 


° MONARCH TOTAL RNA MINIPREP KIT — optimize 
with a variety of sample types, including cells, tissues, blood, an 


Make the change and migrate to Monarch today. 


Learn more at NEBMonarch.com 


One or more of these products are covered by patents, trademarks and/or copyrights owned or controlled by New England Biolabs, Inc. 
... intellectual property rights for certain applications. 
© Copyright 2017, New England Biolabs, Inc.; all rights reserved. 


AAAS 


ADVANCING 
SCIENCE 


Advance registration available until January 24 


The 2018 meeting theme highlights the critical roles of academia, 
government, and industry in moving ideas into innovations. 


See Inside for Details: 


President's Address / Registration Rates 
Plenary Lectures / Topical Lectures 
Seminars / Session Tracks / Flash Talk Program 




AAAS | 2018 ANNUAL MEETING 


Join us in Austin 


Learn about the critical roles of academia, 
government, and industry in moving ideas into 
innovations. 


Seminars on technology and innovation; the 
future of artificial intelligence; diversity and 
inclusion; and communicating science 


120+ scientific sessions in 14 disciplinary 
tracks covering the latest research advances 


5 flash talk sessions: dynamic presentations 
and discussion 


Network with colleagues and attend career 
development workshops 


Connect with us 

@AAASmeetings #AAASmtg 
facebook.com/AAAS.Science 
aaas.org/meetings 

Reporters: The AAAS Annual Meeting 
Newsroom will be hosted on EurekAlert! 
at eurekalert.org/aaasnewsroom 


Dear Colleague: 


In these changing times, it is critical that academia, government, 
and industry continue to work together to move ideas into innovative 
advancements. This is why the theme of the 2018 AAAS Annual Meeting 
is Advancing Science: Discovery to Application. 


On behalf of the AAAS Board of Directors, I urge you to join us in 
Austin February 15-19, where this theme will be explored through 
interdisciplinary scientific sessions, renowned speakers, and one-on-one 
discussions. The AAAS Annual Meeting is the most widely reported global 
science gathering and the premier event at which you can network with 
future collaborators across disciplines. 


We look forward to seeing you in Austin. Registration and housing details 
are now available online. 


Susan Hockfield 
AAAS President 
President Emerita and Professor of Neuroscience, 
Massachusetts Institute of Technology 


PRESIDENT’S ADDRESS 


Susan Hockfield 
Thursday, February 15 


Dr. Susan Hockfield served as president of 
the Massachusetts Institute of Technology 
from 2004 to 2012. As the first woman 
and the first biologist in that role, she highlighted the 
importance of building diversity all along the talent 
pipeline. She fostered cross-disciplinary, cross-institutional, 
and cross-national initiatives, among them the Koch 
Institute for Integrative Cancer Research, the MIT Energy 
Initiative, and the Massachusetts Green High-Performance 
Computing Center, and she co-chaired the White House’s 
Advanced Manufacturing Partnership. By expanding MIT’s 
international education and research activities, including 
the launch of edX, she amplified MIT’s global engagement. 
Hockfield avidly advocates increasing interactions across 
the academy, industry, and government. 


Hockfield earned her Ph.D. in anatomy and neuroscience 
from the Georgetown University School of Medicine. She 
was a National Institutes of Health postdoctoral fellow at 
the University of California, San Francisco, and a member of 
the scientific staff at the Cold Spring Harbor Laboratory in 
New York before joining the faculty at Yale University in 1985. 
At Yale, Hockfield was named the William Edward Gilbert 
Professor of Neurobiology and served as dean of the Yale 
University Graduate School of Arts and Sciences and then as 
provost of the university. 


Hockfield was among the first scientists to apply molecular 
biology to neuroscience, using monoclonal antibodies to 
study brain structure and development. She demonstrated 
that early experience leads to lasting changes in the 
molecular structure of the brain and discovered a gene 
involved in the spread of brain cancer cells into healthy brain 
tissue. 


Dr. Hockfield became a member of AAAS in 1975, was 
elected as a Fellow in 2005, and currently serves as 
president of AAAS. 


AMERICAN ASSOCIATION FOR THE ADVANCEMENT OF SCIENCE • aaas.org/meetings 




ADVANCING SCIENCE: DISCOVERY TO APPLICATION 


PLENARY LECTURES 


Ellen Ochoa 
Director, Johnson Space Center, National Aeronautics 
and Space Administration, Houston, TX 
The International Space Station: A Laboratory in Space 
Friday, February 16 


Cori Bargmann 
President, Chan Zuckerberg Science, Palo Alto, CA 
The Chan Zuckerberg Initiative: Accelerating Science 
Saturday, February 17 


Katharine Hayhoe 
Professor, Texas Tech University, Lubbock 
When Facts Are Not Enough 
Sunday, February 18 


TOPICAL LECTURES 


Friday, February 16 Sunday, February 18 


Jason De Leon 
University of Michigan, Ann Arbor 


James P. Allison 
University of Texas M.D. Anderson Cancer Center, Houston 


Nina Kraus 
Northwestern University, Evanston, IL 


Meg Urry 
Yale University, New Haven 


Nora Volkow 
National Institute on Drug Abuse, National Institutes of 
Health, Bethesda, MD 


JOHN P. MCGOVERN AWARD LECTURE 
IN THE BEHAVIORAL SCIENCES 


Robert A. Bjork 
University of California, Los Angeles 




Saturday, February 17 


Thomas Maina Kariuki 
African Academy of Sciences, Nairobi, Kenya 


Jed S. Rakoff 
U.S. District Court for the Southern District of New York, 
New York City 

SARTON MEMORIAL LECTURE IN THE 
HISTORY AND PHILOSOPHY OF SCIENCE 


Bruce J. Hunt 
University of Texas, Austin 


Updated program information will 
be posted on aaas.org/meetings 
as it becomes available. 


February 15-19, 2018 • AAAS ANNUAL MEETING • Austin 


SEMINARS 


Communicating Science 

Science and technology are integral to modern life, and 
many critical decisions facing society require finding 
common ground between scientists and members of the 
public. This annual seminar focuses on different aspects 
of and approaches to communicating science, always 
emphasizing both theory and practice. The sessions 
provide a forum for scientists, science communication 
and public engagement professionals, and social 
scientists whose research can inform best practices 
to share their expertise and learn from one another. 
Participants gain actionable knowledge and join a 
growing community focused on public engagement with 
science. 


Organized by Emily Cloyd and Elana Kimbrell, AAAS Center 
for Public Engagement with Science, 
Washington, DC 


Reaching Beyond the Science-Interested Public 
Developing a Narrative About Your Data 


Advocating for Public Engagement With Science 


Ideas to Innovation 

This seminar brings together a diverse group of 
sessions around the 2018 AAAS Annual Meeting theme, 
“Advancing Science: Discovery to Application.” Each of 
these panels highlights different ways scientific research 
translates into applications that serve society, and 
how the various sectors within the scientific enterprise 
collaborate to achieve this goal. One session focuses 
on materials research and manufacturing, sharing 
examples of how companies choose what to invest in, 
and how partnerships with academia and government 
have contributed to new technologies. Another session 
explores the stages of technology development and 
transfer across several areas of chemistry, describing 
impacts in the fields of energy, transportation, and public 
health. A third panel traces the path from laboratory 
research to treatments for neurodegenerative diseases, 
and a fourth discusses collaboration in national security 
and defense between government, academia, and 
private industry, which leads not only to technological 
advancement but also to improvements in how social 
science is applied to public policymaking. 


Materials Research for Manufacturing: Lessons From 
Industry on New Product Development 


Organized by Lynnette D. Madsen, Svedberg Science Inc., 
Falls Church, VA 


Technological Applications of Chemistry: Stages of 
Development and Societal Impacts 


Organized by Jonathan Sessler, University of Texas, Austin 


From New Discoveries to New Treatments for 
Neurodegenerative Diseases 

Organized by Benjamin Wolozin, Boston University School 
of Medicine, MA 


The Role of National Security in Strengthening the 
Science and Technology Pipeline 

Organized by Taeyjuana Lyons, U.S. Department of 
Defense, Alexandria, VA 


The Future of Artificial Intelligence 

Artificial intelligence has potential applications and 
implications across nearly all areas of life and science. 
Convening three leaders in the field, one session 
addresses landscape-level questions of how artificial 
intelligence research and development has grown to 
where it is today and how it will meet future challenges 
to progress. One session demonstrates how artificial 
intelligence can improve water resource management 
decisions, and another addresses how to develop 
artificial intelligence technologies in socially responsible 
ways. A fourth session focuses on areas where humans 
will work with machines rather than be replaced by 
them, exploring how the unique strengths of each can be 
leveraged to maximize effectiveness. 


Advancing Artificial Intelligence: From the Lab to the 
Street 


Organized by Henry Kautz, University of Rochester, NY 


Finding Water Management Solutions With Artificial 
Intelligence 

Organized by Suzanne A. Pierce, Texas Advanced 
Computing Center, Austin; Yolanda Gil, University of 
Southern California, Marina del Rey 


The Fourth Industrial Revolution: Supporting Societal 
Needs With Artificial Intelligence 

Organized by Claire Craig, The Royal Society, London, 
United Kingdom 


Artificial Intelligence Augmenting Not Replacing 
People 

Organized by Ann Drobnis, Computing Community 
Consortium, Washington, DC; Gregory Hager, Johns 
Hopkins University, Baltimore, MD 


Diversity and Inclusion 

Broadening access to science and scientific careers, and 
supporting equity in treatment as well as opportunity 
are critical to individual civil rights, and they also 
benefit science and society at large. These efforts 
include both outreach and recruitment of historically 
marginalized groups (such as people with disabilities, 
underrepresented minorities, women, and those 
identifying as LGBTQ+), as well as changes to the culture 
and structure of scientific institutions and endeavors to 
ensure these individuals are accepted, able to thrive, and 
are reflected in how science is viewed and conducted. 
This seminar convenes sessions discussing various 
aspects of STEM diversity and inclusion to share projects 
and data and gather additional perspectives on the 
range of ways to achieve these goals through funding, 
programs, and everyday action. 


A More Inclusive Science: Examples, Tools, and 
Strategies 

Organized by Jarita Holbrook, National Science 
Foundation, Arlington, VA 


Changing Expectations: The Future of Careers in 
STEM 

Organized by Claire Craig, The Royal Society, London, 
United Kingdom 


Communication Challenges and Opportunities for 
Women in STEM 

Organized by Christine O'Connell, Alan Alda Center for 
Communicating Science, Stony Brook, NY; Amy Landis, 
Colorado School of Mines, Golden 


LGBTQ+ Identities in STEM Fields: Research and 
Implications 

Organized by Rochelle Diamond, National Association of 
Gay and Lesbian Scientists and Technical Professionals, 
Pasadena, CA; Allison Mattheis, California State University, 
Los Angeles 


2018 FLASH TALK PROGRAM 


MANAGING INNOVATION 


Speakers in this flash talk session discuss the range of ways 
research and technology development can be fostered and 
regulated. One talk focuses on a public-private partnership, 
another on a national government effort, and a third on the web 
of existing and potential options for managing one important 
area of science. After these short talks, all attendees participate 
in a group discussion about strategies and considerations for 
efficiently and effectively encouraging innovation. 


TECHNOLOGY TO SERVE THE WORLD 


This session brings together flash talks oriented toward the 
interface between technological advances and societal needs. 
Speakers share a diversity of topics, from space technology to 
medicine-quality screening to applying the Internet of Things 
in agriculture. Each considers how these technologies serve 
the public good in just and equitable ways. After these short 
talks, a group discussion provides an opportunity to learn more 
about the specific technologies and projects, as well as consider 
broader questions about technology development and its 
benefits. 


ADVOCATING FOR SCIENCE 


These flash talks share a variety of motivations and strategies 
for engaging in science policy and science advocacy. Topics 
include a program promoting the value of science to society as a 
method of garnering political support, and research conducted 
during the March for Science to help place the event in a 
larger context and examine its outcomes. Another speaker will 
provide action-oriented tips for researchers interested in policy 
engagement. After these short talks, all participants join for a 
30-minute group discussion. 


COMMUNICATION AND PERCEPTION 


This session considers human communication from 
anthropological, linguistic, and cognitive perspectives. It 
touches on continuing studies and theories of how language 
evolved, how our speech affects the way we perceive and 
receive others, and the science communication implications 
of research on how humans reason, process information, and 
make decisions. After three short talks, a 30-minute discussion 
provides an opportunity for all participants to engage with the 
speakers and one another. 


DEVELOPING ROBOTICS TO ASSIST HUMANS 


In this flash talk session, experts in robotics share several new 
applications of robot technology and explore the associated 
technological and societal questions and challenges. One 
speaker focuses on how autonomous intelligent robots are 
being programmed to learn and interact, while another talk 
centers on the use of robots in disasters and the reasons these 
haven't been deployed very successfully in the United States. 
A third flash talk shares recent research on how machines can 
infer the mental state of humans and thereby enhance human 
performance on military, healthcare, and manufacturing tasks. 
After three short talks, a 30-minute discussion provides an 
opportunity for all participants to engage with the speakers and 
one another. 


SESSION TRACKS 


Organizers are listed under session titles. 


BIOLOGY AND CHEMISTRY 


Advancing Health and Environmental 
Science Through Standardized 
Laboratory Microbial Ecosystems 
Organized by Trent Northen, Lawrence Berkeley 
National Laboratory, Berkeley, CA; Karsten 
Zengler, University of California, San Diego 


Applying Conservation Genetics and 
Genomics to Wildlife and Fisheries 
Management 

Organized by Abraham J. Miller-Rushing, Acadia 
National Park, Bar Harbor, ME; Kelly LaRue, 
Jackson Laboratory, Bar Harbor, ME 


Applying Insights From Animal 
Behavior to Address Global 
Challenges 

Organized by Vanessa Ezenwa, University of 
Georgia, Athens; John Swaddle, College of 
William and Mary, Williamsburg, VA 


Applying Mass Spectrometry to 
Understanding Complex Cellular 
Processes 

Organized by Livia Eberlin and Jennifer Brodbelt, 
University of Texas, Austin 


Assessing Risk From Chemical 
Exposures: Advances in Technology 
and Gaps in Application 

Organized by Ellen Mantus, U.S. National 
Academies of Sciences, Engineering, and 
Medicine, Washington, DC; David C. Dorman, 
North Carolina State University, Raleigh 


Future Products of Biotechnology 
and Needs for Risk Analysis Science 
Organized by Kara Laney, U.S. National 
Academies of Sciences, Engineering, and 
Medicine, Washington, DC 


Opioid Addiction: Biology, 
Psychology, and Social Policy 
Organized by Yasmin Hurd, Icahn School of 
Medicine at Mount Sinai, New York City 


Synthetic Biology: From Technology 
Development to Risk Governance 
Organized by Katherine Bowman, U.S. National 
Academies of Sciences, Engineering, and 
Medicine, Washington, DC; Igor Linkov, U.S. 
Army Engineer Research and Development 
Center, Concord, MA 


The Science of Art Conservation: 
Preserving Cultural Heritage Objects 
Organized by Eric Breitung, Metropolitan 
Museum of Art, New York City 


CLASSROOM TO CAREER 


Empirical Findings on Science Fairs: 
Experiencing the Nature of Science 
Organized by Frederick Grinnell, University of 
Texas Southwestern Medical Center, Dallas 


Evidence for More Versatile Graduate 
Education and Academic Culture 
Organized by Linda Hyman, Boston University, 
MA; Muriel Poston, Pitzer College, Claremont, CA 


Overcoming Barriers to Change: 
Applying Evidence-Based STEM 
Teaching Strategies 

Organized by Emily Miller, Association of 
American Universities, Washington, DC 


Understanding Your Roots: STEM 
Diversity and an Evidence-Based 
Curriculum 

Organized by Elizabeth Wright and Nina 
Jablonski, Pennsylvania State University, 
University Park 


Women in STEM at Historically 
Black Institutions: South Africa and 
the United States 

Organized by Lindiwe Gama and Cecil Masoka, 
South African Department of Science and 
Technology, Pretoria 


CLIMATE AND THE 
ENVIRONMENT 


Applying Earth Science Models and 
Satellite Observation to Benefit and 
Engage Society 

Organized by Margaret Hurwitz and Danielle 
Wood, NASA Goddard Space Flight Center, 
Greenbelt, MD 


Informing Mitigation and Adaptation 
Options With the Climate Science 
Special Report 

Organized by Donald J. Wuebbles, University of 
Illinois, Urbana 


Involving Stakeholders to Improve 
Outcomes: Lessons From the 
Climate Science Centers 

Organized by Renee McPherson, University of 
Oklahoma, Norman; Katharine Hayhoe, Texas 
Tech University, Lubbock 


Longer-Term Models for Arctic 
Sea Ice and Global Earth System 
Predictions 

Organized by Sim James, U.S. National Earth 
System Prediction Capability Interagency 
Program, Silver Spring, MD; Jessie Carman, 
U.S. National Oceanic and Atmospheric 
Administration, Silver Spring, MD 


Mathematics of Planet Earth: 
Superbugs, Storm Surges, and 
Ecosystem Change 

Organized by Hans Engler and Hans Kaper, 
Georgetown University, Washington, DC 


Mitigating Methane From Super- 
Emitters: California Pilot Projects 
Organized by Riley Duren, NASA Jet Propulsion 
Laboratory, Pasadena, CA 


Understanding Causality to Inform 
Decision-Making Under Uncertainty 
Organized by Adam Douglas Henry, University of 
Arizona, Tucson; Thomas Dietz, Michigan State 
University, East Lansing 


COMMUNICATION, 
LANGUAGE, AND CULTURE 


Cultural and Linguistic Insights 
From the Study of Immigrant 
Languages 

Organized by Joseph Salmons, University of 
Wisconsin, Madison 


Evaluation and Best Practices for 
Training in Science Communication 
Organized by Anthony Dudo, University of Texas, 
Austin 


Exploring Public Fears and Myths: 
Vaccine Hesitancy, Food Safety in 
Fukushima, and Bacteria 

Organized by Miyoko O. Watanabe, Japan 
Science and Technology Agency, Tokyo; Mark 
Ferguson, Science Foundation Ireland, Dublin 


Gender in Translation: How Speech 
Communicates Sex, Gender Identity, 
and Sexuality 

Organized by Nan Bernstein Ratner, University 
of Maryland, College Park 


Natural and Cultural Resource 
Stewardship: New Scientific Insights 
and Audiences 

Organized by Suzanne M. Thurston, AAAS Office 
of Education and Human Resources Programs, 
Washington, DC 


Strategies for Communities and 
Scientists to Collaborate Effectively 
Organized by Cathryn A. Manduca, Carleton 
College, Northfield, MN 


The Impact of Sputnik on Science, 
Technology, and the Public in the 
United States 

Organized by Jon D. Miller, University of 
Michigan, Ann Arbor 


Understanding and Responding to 
Climate Change Denial 

Organized by Heather Akin, University of 
Missouri, Columbia; Matthew H. Slater, Bucknell 
University, Lewisburg, PA 


Visual, Attentional, and Gestural 
Foundations of Signed Languages 
Organized by Richard P. Meier, University of 
Texas, Austin 


DATA AND COMPUTING 


Estimating the Prevalence of Human 
Trafficking in the United States 
Organized by Theresa L. Harris, AAAS Scientific 
Responsibility, Human Rights and Law Program, 
Washington, DC; Davina Durgana, Walk Free 
Foundation, Great Falls, VA 


Experiencing the Future Internet: 
User-Centered Social Television and 
Multimedia 

Organized by David Wizel and Agata Stasiak, 
European Commission, Brussels, Belgium 


Exploring Universal and Industrial 
Quantum Computing 

Organized by Charles W. Clark, Joint Quantum 
Institute, Gaithersburg, MD; Daniel Rogers, 
Terbium Labs, Baltimore, MD 


How Honey Bees Set Web Servers 
Abuzz 

Organized by Erin Heath, AAAS Office of 
Government Relations, Washington, DC; Josh 
Shiode, Semiconductor Industry Association, 
Washington, DC 


Prospects for Long-Term 
Information Security in the Face of 
Quantum Computing 

Organized by Scott Aaronson, University of 
Texas, Austin; Charles W. Clark, Joint Quantum 
Institute, Gaithersburg, MD 


Rethinking Approaches to Disaster 
Management and Public Safety With 
Intelligent Infrastructure 

Organized by Ann Drobnis, Computing 
Community Consortium, Washington, DC; Daniel 
Lopresti, Lehigh University, Bethlehem, PA 


Transforming Cities, Transportation, 
and Agriculture With Intelligent 
Infrastructure 

Organized by Ann Drobnis, Computing 
Community Consortium, Washington, 
DC; Elizabeth Mynatt, Georgia Institute of 
Technology, Atlanta 


Using Wearable Device Data to 
Analyze and Improve Physical 
Activity and Health 

Organized by Raymond Carroll, Texas A&M 
University, College Station 


ENERGY 


A Sustainable Energy Future With 
Next-Generation, Low-Cost Solar 
Cells 

Organized by Juan-Pablo Correa-Baena, 
Massachusetts Institute of Technology, 
Cambridge; Michael Saliba, Federal Institute of 
Technology, Lausanne, Switzerland 


Addressing Systemic and Societal 
Factors in Germany’s Energy 
Transition 

Organized by Stefan Stückrad, Institute for 
Advanced Sustainability Studies, Potsdam, 
Germany 


Bio-Based Industries Joint 
Undertaking: A Model for Catalyzing 
Sustainable Bio-Based Economic 
Growth 

Organized by Sarah Black, Biobased Industries 
Joint Undertaking, Brussels, Belgium; Eleni Zika, 
Biobased Industries Joint Undertaking, Brussels, 
Belgium 


Breakthroughs in Cellulosic Biomass 
and Transportation Fuels 
Organized by Elizabeth Hood, Arkansas State 
University, Jonesboro 


Energy-Enabling Materials and the 
Smart Cities of Tomorrow 

Organized by Olga Rio and Luca Polizzi, 
European Commission, Brussels, Belgium 


Solving Materials Science 
Challenges to Shape Our Energy 
Future 

Organized by Peter Genzer and James Misewich, 
Brookhaven National Laboratory, Upton, NY 


ADVANCE REGISTRATION RATES 

                      AAAS Member   New Member   Non-Member 
General Attendee      $310          $380         $440 
Postdoc               $135          $200         $360 
K-12 Teacher          $135          $200         $360 
Retired Professional  $250          $320         $360 
Student               $65           $70          $95 
One-Day               $175          N/A          $220 

AAAS Member: Rates for members in good standing 
New Member: Includes a year of AAAS membership 
Non-Member: Rates for all other attendees 

Advance registration rates are available until January 24 
aaas.org/meetings 


ENGINEERING AND 
TECHNOLOGY 


Advancing Biopharmaceutical 
Manufacturing With Public-Private 
Partnerships 

Organized by Mike Molnar, U.S. National Institute 
of Standards and Technology, Gaithersburg, MD 


Analyzing Picasso: Scientific 
Innovation, Instrumentation, and 
Education 

Organized by Marc Walton, Northwestern 
University, Evanston, IL; Francesca Casadio, Art 
Institute of Chicago, IL 


Biomedical Sensors: Advances in 
Health Monitoring and Disease 
Treatment 

Organized by Qiaoqiang Gan, State University 
of New York, Buffalo; Zakya Kafafi, Lehigh 
University, Bethlehem, PA 


Closing the Innovation Gap Between 
Research and Industry 

Organized by Kaoru Natori, Okinawa Institute 
of Science and Technology Graduate University, 
Japan 


Future-Generation Cars and Drivers: 
Research, Moving to Market, and 
Policymaking 

Organized by Ingrid Skogsmo, European 
Commission, Brussels, Belgium 


Gateway to Discovery: Research 
Teams Exploring the Edge of 
Technical Feasibility 

Organized by Ben Verschueren, General Electric 
Global Research, Niskayuna, NY 


Harnessing the Transformative 
Potential of Photonics Research and 
Innovation 

Organized by Alex van Nieuwland, EuroTech 
Universities Alliance, Brussels, Belgium; Sandra 
M.J. Buys, Eindhoven University of Technology, 
Netherlands 


Improvements in Earthquake 
Science and Risk Reduction 
Organized by John Anderson, University of 
Nevada, Reno; William Savage, Seismological 
Society of America, Las Vegas, NV 


Open Innovation Ecosystem for 
Advancing Science Through 
Advanced Manufacturing Research 
and Development 

Organized by James Garrett, Carnegie Mellon 
University, Pittsburgh, PA; Sudarsan Rachuri, 
U.S. Department of Energy, Washington, DC 


Oil and Water Do Mix: The Fate of 
Dispersed Oil Droplets in the Sea 
Organized by Edward Buskey, University of Texas 
Marine Science Institute, Port Aransas 


Successful Innovation and 
Commercialization in Industrial 
Science and Technology 
Organized by William Provine, DowDuPont, 
Wilmington, DE 


Sustained Academic, Industry, and 
Clinician Collaboration in Pre- 
Competitive Medical Research 
Organized by Zoran Zvonar, Analog Devices Inc, 
Wilmington, MA; Charles Sodini, Massachusetts 
Institute of Technology, Cambridge, MA 


Using Real-Time GPS for a Global 
Tsunami Early Warning System 
Organized by Michael Angove, U.S. National 
Oceanic and Atmospheric Administration, 
Silver Spring, MD; Gerald Bawden, U.S. National 
Aeronautics and Space Administration, 
Washington, DC 


GLOBAL COLLABORATION 


Building Research Capacity as a 
Critical Component of International 
Development 

Organized by Matt Goode, U.K. Research and 
Innovation, Swindon, United Kingdom 


Cuban Biomedical Science: The Role 
of Science Diplomacy in Translating 
Cures 

Organized by Mark Rasenick, University of 
Illinois College of Medicine, Chicago 


Embracing International 
Partnerships to Achieve Science- 
Based Development in Kuwait 
Organized by Ameena Farhan and Layla 
al-Musawi, Kuwait Foundation for the 
Advancement of Sciences, Sharq 


Instruments of Science and 
Diplomacy: The Importance 
of International Research 
Organizations 

Organized by Jan Marco Muller, International 
Institute for Applied Systems Analysis, 
Laxenburg, Austria 


Issues and Impacts of U.S. Global 
Food Security Policy and Research 
Organized by Nora Lapitan and Jerry Glover, 
U.S. Agency for International Development, 
Washington, DC 


Migration: A Case for Science 
Diplomacy 

Organized by Jan Marco Muller, International 
Institute for Applied Systems Analysis, 
Laxenburg, Austria 


Science for Sustainable 
Development Goals: Key Lessons 
and Gaps 

Organized by E. William Colglazier, AAAS Center 
for Science Diplomacy, Washington, DC 


MEDICINE AND HEALTH 


Additive Manufacturing and 3-D 
Printing: Medical Technology 
Applications and Impacts 
Organized by Luca Polizzi, European 
Commission, Brussels, Belgium 


Advanced Technology for Oral 
Health Care: Diagnosis, Prevention, 
and Treatment 

Organized by Janet Moradian-Oldak, University 
of Southern California, Los Angeles 


Applying the Science of Genomics 
in Precision Medicine and Cancer 
Treatment 

Organized by William Beck, University of Illinois, 
Chicago 


Effective, Safe, and Underutilized: 
The HPV Vaccine From Development 
to Implementation 

Organized by Joseph Margolick, Johns Hopkins 
Bloomberg School of Public Health, Baltimore, 
MD; Melinda Wharton, U.S. Department of 
Health and Human Services, Washington, DC 


Emerging Cancer Immunotherapies: 
Challenges of Developing Modality- 
Specific Drugs 

Organized by Cris Kamperschroer, Pfizer Inc., 
Groton, CT; Gautham Rao, Genentech Inc., 
South San Francisco, CA 


Emerging Epidemic Pathogens: 
Basic, Translational, and Social 
Science 

Organized by Gerald Keusch, National Emerging 
Infectious Diseases Laboratory, Boston, MA 


Emulating Human Biology: Organ 
Chips for Drug Development and 
Personalized Medicine 

Organized by Geraldine Hamilton, Emulate Inc., 
Boston, MA 


Evolutionary Arms Races: From 
Bacteria to Cancer 
Organized by Susan M. Rosenberg, Baylor 
College of Medicine, Houston, TX 


Faster Responses to Epidemics and 
Bioterrorism With Standardized 
Product Development 

Organized by Jeffrey Fortman, U.S. Department 
of Defense, Washington, DC 


Gene Editing for Xenogeneic Organ 
Production: Regenerating a Patient’s 
Transplantation Organ 
Organized by Alison Van Eenennaam, University 
of California, Davis 


Harnessing the Human Microbiome 
as a Tool for Prevention and 
Treatment of Disease 

Organized by Wendy Cozen, University of 
Southern California, Los Angeles; Rob Knight, 
University of California, San Diego 


Statistical and Computational 
Challenges in Genomics and 
Precision Medicine 

Organized by Michael Boehnke, University of 
Michigan, Ann Arbor; Michael Epstein, Emory 
University School of Medicine, Atlanta, GA 


Strategies to Accelerate 
the Widespread Adoption of 
Pharmacogenomics in Healthcare 
Organized by Elizabeth Woo, Thermo Fisher 
Scientific, Waltham, MA; Tricia Kenny, Thermo 
Fisher Scientific, Carlsbad, CA 


NEUROSCIENCE 


Advanced Data Analysis Techniques 
for Understanding Brain Function 
Organized by Robert Kass, Carnegie Mellon 
University, Pittsburgh, PA 


Brain Plasticity Revisited: How 
Special Is the Young Brain? 
Organized by Barbara Landau, Johns Hopkins 
University, Baltimore, MD 


Connecting Behavior and the Brain 
to Understand Mental Health 
Organized by Nora Newcombe, Temple 
University, Philadelphia, PA 


New Technologies Emerging From 
the BRAIN Initiative 

Organized by Walter J. Koroshetz, U.S. National 
Institutes of Health, Bethesda, MD 


Optimal Aging and Mechanisms of 
Neurocognitive Enrichment 
Organized by Denise Park, University of Texas, 
Dallas 


“To Sleep, Perchance to Dream”: 
Rewiring the Brain During Sleep 
Organized by Ted Abel, University of Iowa, 
Iowa City 


Uncovering Novel Epigenetic 
Modifications in Neuropsychiatric 
Diseases 

Organized by Tracy Bale, University of 
Pennsylvania School of Veterinary Medicine, 
Philadelphia 


PHYSICS AND 
ASTRONOMY 


A Universe of Discoveries: Progress 
in Astronomy, Statistics, and 
Machine Learning 

Organized by Chad Schafer, Carnegie Mellon 
University, Pittsburgh, PA 


Asteroids for Research, Discovery, 
and Commerce 
Organized by Martin Elvis, Harvard-Smithsonian 
Center for Astrophysics, Cambridge, MA 


Capturing Dark Matter With 
Quantum Devices 

Organized by Maria Spiropulu, California 
Institute of Technology, Pasadena 


Exoplanets Everywhere: Discovering 
and Characterizing Worlds Beyond 
Our Solar System 

Organized by Jennifer Wiseman, NASA Goddard 
Space Flight Center, Greenbelt, MD 


Innovation in 2-D: Discovering Novel 
Materials and Applications 

Organized by Eva Andrei, Rutgers, The State 
University of New Jersey, Piscataway 


Investigating the Mysteries of 
Antimatter 

Organized by Saeko Okada, High Energy 
Accelerator Research Organization, Tsukuba, 
Japan; Arnaud Marsollier, European 
Organization for Nuclear Research, Geneva, 
Switzerland 


Is There a Future for Humanity in 
Space? 

Organized by Amanda Arnold, Arizona State 
University, Washington, DC 


Revolutionizing Ultrasound 
Applications for Treating Disease 

Organized by Mark Hamilton, University of Texas, 
Austin; Joel Mobley, University of Mississippi, 
University 


Scientific and Engagement 
Outcomes of the 2017 Total Solar 
Eclipse 

Organized by Angela Speck, University of 
Missouri, Columbia; Jay Pasachoff, Williams 
College, Williamstown, MA 


The Chemistry and Physics of 
Nascent Planetary Systems 
Organized by Mark T. Adams, National Radio 
Astronomy Observatory, Charlottesville, VA 


The Technological Applications of 
Glass: From Smartphones to Eagle- 
Eye Vision 

Organized by Barbara Jones, IBM, San Jose, CA 


POLICY 


Balancing Facts and Values in Public 
Policymaking 

Organized by Milena Raykovska, European 
Commission Joint Research Center, Brussels, 
Belgium 


Building Public Trust and Fostering 
Innovation With Transparency 
Organized by Matt Goode, U.K. Research and 
Innovation, Swindon, United Kingdom 


Exploring Perspectives on Open 
Science and Impacts on Scientific 
Discovery 

Organized by Ruth Francis and Shane Canning, 
F1000, London, United Kingdom 


Gene Editing and Human Identity: 
Promising Advances and Ethical 
Challenges 

Organized by Se Y. Kim and Robert O'Malley, 
AAAS Dialogue on Science, Ethics, and Religion, 
Washington, DC 


Industry and Research 
Infrastructures as Co-Creators of 
Innovation 

Organized by Jana Pavlic, European Molecular 
Biology Laboratory, Heidelberg, Germany; 
Antonio Di Giulio, European Commission 
Directorate-General for Research and 
Innovation, Brussels, Belgium 


Recommendations of the U.S. 
Commission on Evidence-Based 
Policymaking 

Organized by Nick Hart, Bipartisan Policy Center, 
Washington, DC 


Reimagining the Innovation 
Ecosystem: Experiments to 
Maximize the Impact of Hard Tech 
Organized by Brenna Krieger, U.S. Department 
of Energy, Washington, DC 


Research and Policy on Voter ID 
Laws and Voter Participation 
Organized by David Marker, Westat, Rockville, 
MD 


Science Activism: Advancing 
Science in a New Political Landscape 
Organized by Carol L. Rogers, University of 
Maryland, Washington, DC 


Science and the Fair Administration 
of Justice 

Organized by Alicia Carriquiry, Iowa State 
University, Ames 


Translating Engineering and 
Operations Analyses into Effective 
Homeland Security Policy 

Organized by Sheldon Jacobson, University of 
Illinois, Urbana 


When Regulation Drives Innovation 
Organized by Gerald Epstein, Independent, 
Bethesda, MD 


SOCIAL SCIENCES 


Advancing Interdisciplinary 
Collaboration With Lessons From 
the Field 

Organized by Nancy Nersessian, Harvard 
University, Cambridge, MA; Hanne Andersen, 
University of Copenhagen, Denmark 


Behavioral Challenges and Solutions 
to Science-Based Action 

Organized by Kateryna Wowk, Texas A&M 
University, Corpus Christi, TX 


Economic Analysis of Links Between 
Patents, Publications, and Policy 
Organized by Pierre Azoulay, Massachusetts 
Institute of Technology Sloan School of 
Management, Cambridge; Bhaven Sampat, 
Columbia University, New York City 


Implications of Evidence About Drug 
Use Hot Spots, Gerrymandering, and 
Gang Violence 

Organized by William Alex Pridemore, State 
University of New York, Albany 


Pre-Election Polling Uncertainty: An 
Interdisciplinary Analysis 
Organized by Andrew Gelman, Columbia 
University, New York City 


Social Inequality and Obesity: 
Effects of Poverty and Uncertainty 
on Health 

Organized by Gregory Pavela, University of 
Alabama, Birmingham 


Survey Data Collection: Theoretical 
and Practical Perspectives 

Organized by Florian Keusch, University of 
Mannheim, Germany; Frauke Kreuter, University 
of Maryland, College Park 


The Role of Conspiracy Theories 
in Perceptions of Fake News About 
Science 

Organized by Jason Reifler, University of Exeter, 
Devon, United Kingdom; Asheley Landrum, 
Texas Tech University, Lubbock 


Using Social Networks to Improve 
Response to Natural Disasters 
Organized by Branda Nowell, North Carolina 
State University, Raleigh; Toddi Steelman, 
University of Saskatchewan, Saskatoon, Canada 


What Citizens Think About Science: 
Survey Data and Implications for 
Communicators 

Organized by Cary Funk, Pew Research Center, 
Washington, DC; John C. Besley, Michigan State 
University, East Lansing 


SUSTAINABILITY AND 
RESOURCES 


Airborne Laser Mapping Research in 
Tropical Forests and Archaeological 
Sites 

Organized by Timothy Beach and Sheryl 
Luzzadder-Beach, University of Texas, Austin 


Dealing With Deadly Pests Through 
the Sterile Insect Technique 
Organized by Erin Heath, AAAS Office of 
Government Relations, Washington, DC; Josh 
Shiode, Semiconductor Industry Association, 
Washington, DC 


From Lab to Farm to Table, and Back 
Organized by Elizabeth (Toby) Kellogg, Donald 
Danforth Plant Science Center, St. Louis, MO 


From the Bioreactor to the Plate: 
Addressing the Global Food 
Challenge With Algae 

Organized by Rahel Byland, ETH Zurich, 
Switzerland 


Implications of Water Cycle Science 
and Technology for Resource 
Management 

Organized by Marty Ralph, Scripps Institution 
of Oceanography, La Jolla, CA; Xubin Zeng, 
University of Arizona, Tucson 


AAAS | 2018 ANNUAL MEETING 


AAAS, publisher of Science, thanks the sponsors and 
supporters of the 2018 Annual Meeting 


Overcoming Challenges to Using 
Environmental Big Data for 
Conservation 

Organized by Richard Feldman, Center for 
Scientific Research, Yucatan, Mexico; Abraham 
J. Miller-Rushing, Acadia National Park, Bar 
Harbor, ME 


Phytobiome Research to Improve 
Agricultural Productivity 

Organized by Jan E. Leach, Colorado State 
University, Fort Collins; Kellye Eversole, Eversole 
Associates, Bethesda, MD 


Preparing for Floods and Droughts 
With NASA's “Scale in the Sky” 
Organized by Jay Famiglietti, California Institute 
of Technology, Pasadena 


Science Across Borders: Bilateral 
Collaboration and the U.K.-China 
Agri-Tech Project 

Organized by Terry O'Connor, U.K. Science and 
Technology Facilities Council, Swindon, United 
Kingdom 


The Importance of Tradeoffs in 
Managing Food Security 

Organized by Margaret C. Nelson, Arizona State 
University, Tempe 


The View From Above: Exploring the 
Future of Drones 

Organized by Robert Hickey, Central Washington 
University, Ellensburg 


2018 Annual Meeting app is now available: 
aaas.org/app 


All ads submitted for publication must comply with 
applicable U.S. and non-U.S. laws. Science reserves 
the right to refuse any advertisement at its sole 
discretion for any reason, including without limitation 
for offensive language or inappropriate content, 
and all advertising is subject to publisher approval. 
Science encourages our readers to alert us to any ads 
that they feel may be discriminatory or offensive. 


ScienceCareers.org 


Chair, Dept. of Cell & 
Regenerative Biology 


The University of Wisconsin School of Medicine and Public 
Health (SMPH) invites applications and nominations for the 
position of Chair of the Department of Cell and Regenerative 
Biology (CRB). 


The Department of Cell and Regenerative Biology is committed to 
understanding the fundamental mechanisms by which living 
systems operate at cellular and molecular levels of organization. 
By embracing a wide range of contemporary and emerging 
approaches and experimental systems, the department seeks to 
define signaling and regulatory pathways that provide the basis for 
understanding, diagnosis and treatment of human disease. Basic 
research is the centerpiece of the Department and serves as the 
driving force behind teaching and training efforts. The overarching 
research interests of the Department are highly interdisciplinary, 
emphasizing molecular, cellular and systems approaches to 
describe biological processes in molecular terms. To maintain its 
excellence and stature, the department is currently focusing on 
existing strengths in four research areas: Cell and Molecular 
Biology, Developmental Biology, Stem Cell and Regenerative 
Biology, and Cardiovascular Biology. 


We seek a recognized leader with an outstanding academic 
background, strong research credentials, demonstrated commitment 
to education, experience in mentoring junior faculty, and proven 
leadership and management skills. The chair will provide 
professional and administrative leadership of the highest quality 
to this distinguished department in its teaching and research. 


The successful candidate will have a compelling vision for the 
future of CRB in a leading academic medical center. Candidates 
must have a Ph.D., M.D., MD/Ph.D. or equivalent degree(s). 
They should possess substantial background and experience in 
administrative leadership, research, and teaching, and a strong 
academic background that would qualify for appointment as a 
tenured professor at the University of Wisconsin-Madison. 


Please send nominations to: Terri Young, MD, MBA, and David 
Gamm, MD, PhD, Co-Chairs of the CRB Chair Search Committee, 
c/o Staci Andersen, 4150Q HSLC, 750 Highland Avenue, 
Madison, WI, 53705-2111, or use: slandersen2@wisc.edu 


To apply for this position, please use the University of Wisconsin 
applicant tracking system which is found at the link below. Click 
the “Apply Now” button. Applicants will be asked to upload a 
current cover letter, CV, and list of three references. 


Application Link: jobs.hr.wisc.edu/crbchair 


Applications from minorities and women are strongly encouraged. 
To receive full consideration, applications should arrive by 
January 31, 2018. 


Unless confidentiality is requested in writing, information regarding 
applicants must be released upon request. Finalists cannot be guaranteed 
confidentiality. Wisconsin Caregiver Law applies. The University of 
Wisconsin is an equal opportunity, affirmative action employer. For more 
information: med.wisc.edu, crb.wisc.edu. 


Advance your career with expert advice from Science Careers. 


Download Free Career Advice Booklets! 
ScienceCareers.org/booklets 


FROM THE JOURNAL SCIENCE AAAS 


UAMS 


UNIVERSITY OF ARKANSAS FOR MEDICAL SCIENCES 


ASSISTANT PROFESSOR 
Faculty Positions in 
Biochemistry and Molecular Biology 


Two tenure-track Assistant Professor positions are available in the 
Department of Biochemistry and Molecular Biology 
(http://www.uams.edu/biochem/) at the University of Arkansas for Medical 
Sciences. The positions include competitive salary, benefits, start-up 
packages, research space, and access to state-of-the-art core facilities, 
including a National Resource for Proteomics. The positions will be 
associated with the Winthrop P. Rockefeller Cancer Institute 
(http://www.cancer.uams.edu). Little Rock has an area population of 
600,000, an affordable cost of living, many cultural amenities, and 
beautiful natural surroundings. 


Candidates must possess a PhD and/or MD degree and postdoctoral 
experience. We seek highly qualified biochemists/molecular biologists who 
will establish an internationally leading research program on mechanisms 
of biomedically important processes and who will contribute to teaching 
medical and graduate students. Particular areas of interest include cancer 
biology, developmental biology, pediatric diseases, signal transduction, 
membrane traffic, DNA damage response, epigenetic mechanisms and 
systems biology. Cancer-related applications will be particularly responsive. 


Applicants should submit curriculum vitae, a brief statement of proposed 
research, and arrange for three reference letters to be sent to (electronic 
submission preferred): Biochemistry Search Committee, Department 
of Biochemistry and Molecular Biology, University of Arkansas for 
Medical Sciences, 4301 W. Markham St., Little Rock, AR 72205. 
E-mail: Biochemsearch@uams.edu. 


University of Arkansas for Medical Sciences is an Equal Opportunity/ 
Affirmative Action Employer. 


The 2018 Tinker-Muse Prize 
for Science and Policy in Antarctica 


The “Tinker-Muse Prize for Science and Policy in Antarctica” is a 
USD $100,000 unrestricted award presented to an individual in 
the fields of Antarctic science and/or policy who has demonstrated 
potential for sustained and significant contributions that will 
enhance the understanding and/or preservation of Antarctica. The 
Prize is inspired by Martha T. Muse’s passion for Antarctica and is 
a legacy of the International Polar Year 2007-2008. 


The prize-winner can be from any country and work in any field of 
Antarctic science and/or policy. The goal is to provide recognition 
of the important work being done by the individual and to call 
attention to the significance of understanding Antarctica in a time 
of change. A website with further details, including the process 
of nomination, closing date and selection of the Prize recipients, 
is available at www.museprize.org. 


The Prize is awarded by the Tinker Foundation and administered 
by the Scientific Committee on Antarctic Research (SCAR). 


• A list of 20 scientific career paths with a prediction of which ones best fit your skills and interests 
• A tool for setting strategic goals for the coming year, with optional reminders to keep you on track 
• Articles and resources to guide you through the process 
• Options to save materials online and print them for further review and discussion 
• Ability to select which portion of your IDP you wish to share with advisors, mentors, or others 
• A certificate of completion for users that finish myIDP. 


Visit the website and start planning today! 


WORKING LIFE 


By John Tregoning 


From parade ground to PI 


At the end of my postdoc, I excelled at moving tiny amounts of colorless fluid from one tube 
to another at strange times of day, and I had a vague sense of where I wanted to go scientifically. 
But my experience in the lab wasn’t sufficient to equip me to run my own group. That 
preparation came from my double life as a British Army Reserve officer. Admittedly, I learned 
several things in the army that have not been so useful in a research lab, including how to 
march in time with others, shout very loudly, and swear with an unrivaled range of colorful expletives. 
Critically, though, my time in the army taught me how to be a leader—for good and ill. 


One piece of military management 
advice that has stuck with 
me is that “it’s not a popularity 
contest, sir.” In the army, I sometimes 
had to make unpopular 
decisions, such as who stayed behind 
to guard the barracks while 
everyone else went on rest and 
recreation. Getting the job done 
was more important than being 
universally liked. I now draw on 
this mindset in running my lab, 
whether I need to assign people 
to empty the garbage or decide 
who should get first authorship 
on a paper. 

My military experience also 
helps me prioritize my time in 
the best interests of the lab. This 
often comes down to knowing 
when to lend a helping hand with 
a big experiment—the popular 
choice—and when the lab will be better served if I stay 
in the office to finish that paper. More broadly, the army 
taught me that sometimes there is no best decision, only a 
least worst decision. This helps me make difficult choices in 
science, such as when to ditch a study. 

I also learned the importance of the team. Science can 
feel like a solitary activity. You may find yourself working 
in the vicinity of other people—the one who leaves the lab 
a mess, the one who books the key piece of equipment for 
3 days solid and then doesn’t use it, the one who talks too 
loudly about inappropriate things in the communal office— 
but this isn’t the same as working with other people. In the 
army, on the other hand, working in stressful environments 
with people who were tired, cold, and wet fostered camaraderie 
and understanding that helped us be more effective 
individually and as a group. 

So, as a group leader, I have prioritized team-building. 
I have found that getting to know my team and building 
rapport—breaking bread, going to the pub, doing things 
together outside the lab—makes 
it possible for me to stress them 
from time to time, for example 
by making demands for data and 
paper revisions and setting deadlines 
for experiments. But, having 
stressed the team, I have also 
found that it is then necessary to 
rebuild. To do this, I apply the 
ABCs: acknowledgment (telling 
people they’ve done well), booze 
(a bottle of Prosecco goes a long 
way), and cake. 

Of course, the lab is not the 
army. Among other differences, 
academia is less directive and 
more discursive. Discussion leads 
to better science, but I sometimes 
find the lack of hierarchy a challenge 
when it comes to the day-to-day 
work of running the lab. 
People don’t simply do things I tell 
them to just because of my position, and shouting at them 
in my best parade ground voice (plus expletives) is equally 
ineffective. Telling one of my students to “JFDI” (look it up) 
when we were arguing about the best way to do an assay 
did not have the desired effect—quite the opposite, and it 
took a bit of ABC to fix this. So, I have learned to temper my 
inner sergeant major, though he does tend to rear up when 
I am overworked and stressed. 

Ninety percent of being a good lab head is being able 
to direct your team. Although I’ve needed to adjust my 
leadership style for different surroundings, having the 
opportunity to develop a leadership style in the first place 
has been a real advantage. The army is not for everyone. 
But if you want to learn to lead, I would recommend getting 
out of the lab. 


John Tregoning is a senior lecturer at Imperial College 
London. Do you have an interesting career story? Send it 
to SciCareerEditor@aaas.org. 